Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Pavel Stehule
2014-10-25 8:33 GMT+02:00 Pavel Stehule :

>
>
> 2014-10-25 8:19 GMT+02:00 Ali Akbar :
>
>> 2014-10-25 10:29 GMT+07:00 Ali Akbar :
>>
>>> I fixed a small issue in the regression tests and enhanced the tests
>>> for varlena types and null values.
>>>
>>> Thanks.
>>>
>>> it is about 15% faster than the original implementation.
>>>
>>> 15% faster than array_agg(scalar)? I haven't verified the performance, but
>>> because the internal array data and null bitmap are copied as-is, it
>>> should be faster.
>>>
>>> 2014-10-25 1:51 GMT+07:00 Pavel Stehule :
>>>
 Hi Ali

>>>
 I checked the code. I think the code organization is not good.
 accumArrayResult is too long now. makeMdArrayResult will not work when
 arrays are joined (it is not consistent now). I don't like the usage of
 state->is_array_accum in array_userfunc.c -- it is a signal of wrong
 wrapping.

>>> Yes, I was thinking the same. Attached is a WIP patch to reorganize the
>>> code. makeMdArrayResult works now, with the supplied arguments acting as
>>> overrides of the default values calculated from ArrayBuildStateArray.
>>>
>>> In array_userfunc.c, because we don't want to override ndims, dims and
>>> lbs, I copied array_agg_finalfn and only changed the call to
>>> makeMdArrayResult (we don't use makeArrayResult because we want to set
>>> release to false). Another alternative is to create a new
>>> makeArrayResult-like function that has a bool release parameter.
>>>
>>
>> I added makeArrayResult1 (do you have a better name for this?), which
>> accepts a bool release parameter. array_agg_finalfn becomes cleaner, and
>> there is no duplicate code in array_agg_anyarray_finalfn.
>>
>
>  makeArrayResult1 - I have no better name now
>
> I found one more minor detail.
>
> You reuse the array_agg_transfn function. Inside is the message
> "array_agg_transfn called in non-aggregate context", which is not correct
> for array_agg_anyarray_transfn.
>
> The specification of dims and lbs in array_agg_finalfn is probably useless
> when you expect a natural result. Automatic behavior similar to
> array_agg_anyarray_finalfn should work there too.
>
> makeMdArrayResultArray and appendArrayDatum should be marked as "static".
>

I think the change to makeArrayResult is wrong. It was developed for 1D
arrays. Where we now expect an ND result, we should use makeMdArrayResult
there. So makeArrayResult should return a 1D array in all cases. Otherwise
the difference between makeArrayResult and makeMdArrayResult is not
significant, and we have a naming problem.
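The division of labor being proposed can be sketched in a few lines of
self-contained C. All `mock_` names here are invented for illustration
only; the real functions operate on ArrayBuildState and Datums, not plain
ints. The general N-D constructor takes explicit dimensions (as
makeMdArrayResult does), and the 1-D convenience wrapper carries the extra
`bool release` flag that the proposed makeArrayResult1 would expose:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for ArrayBuildState; the real one also tracks
 * null flags and a memory context. */
typedef struct
{
	int		   *values;
	int			nvalues;
} MockArrayBuildState;

typedef struct
{
	int			ndims;
	int			dims[6];
	int		   *data;
} MockArrayType;

/* General worker: the caller supplies the dimensionality explicitly. */
static MockArrayType *
mock_makeMdArrayResult(MockArrayBuildState *state, int ndims,
					   const int *dims, bool release)
{
	MockArrayType *result = malloc(sizeof(MockArrayType));

	result->ndims = ndims;
	memcpy(result->dims, dims, ndims * sizeof(int));
	result->data = malloc(state->nvalues * sizeof(int));
	memcpy(result->data, state->values, state->nvalues * sizeof(int));

	if (release)
	{
		/* the caller is done with the accumulation state */
		free(state->values);
		state->values = NULL;
		state->nvalues = 0;
	}
	return result;
}

/* Convenience wrapper: always builds a 1-D array, but exposes "release"
 * (the makeArrayResult1 idea from this thread). */
static MockArrayType *
mock_makeArrayResult1(MockArrayBuildState *state, bool release)
{
	int			dims[1];

	dims[0] = state->nvalues;
	return mock_makeMdArrayResult(state, 1, dims, release);
}
```

With this split, makeArrayResult keeps its historical "always 1-D"
contract, and any caller that wants an N-D result must say so explicitly
through the general function.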

Regards

Pavel



>
>
>>
>>> Next question: there is a function array_append(anyarray, anyelement).
>>> Isn't it time to define array_append(anyarray, anyarray) now?

>>>
>>> There is array_cat(anyarray, anyarray):
>>>
>>> /*-----------------------------------------------------------------------
>>>  * array_cat :
>>>  * concatenate two nD arrays to form an nD array, or
>>>  * push an (n-1)D array onto the end of an nD array
>>>  *-----------------------------------------------------------------------
>>>  */
>>>
>>
>> Regards,
>> --
>> Ali Akbar
>>
>
>


Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Pavel Stehule
2014-10-25 8:19 GMT+02:00 Ali Akbar :

> 2014-10-25 10:29 GMT+07:00 Ali Akbar :
>
>>> I fixed a small issue in the regression tests and enhanced the tests
>>> for varlena types and null values.
>>
>> Thanks.
>>
>> it is about 15% faster than the original implementation.
>>
>> 15% faster than array_agg(scalar)? I haven't verified the performance, but
>> because the internal array data and null bitmap are copied as-is, it
>> should be faster.
>>
>> 2014-10-25 1:51 GMT+07:00 Pavel Stehule :
>>
>>> Hi Ali
>>>
>>
>>> I checked the code. I think the code organization is not good.
>>> accumArrayResult is too long now. makeMdArrayResult will not work when
>>> arrays are joined (it is not consistent now). I don't like the usage of
>>> state->is_array_accum in array_userfunc.c -- it is a signal of wrong
>>> wrapping.
>>>
>> Yes, I was thinking the same. Attached is a WIP patch to reorganize the
>> code. makeMdArrayResult works now, with the supplied arguments acting as
>> overrides of the default values calculated from ArrayBuildStateArray.
>>
>> In array_userfunc.c, because we don't want to override ndims, dims and
>> lbs, I copied array_agg_finalfn and only changed the call to
>> makeMdArrayResult (we don't use makeArrayResult because we want to set
>> release to false). Another alternative is to create a new
>> makeArrayResult-like function that has a bool release parameter.
>>
>
> I added makeArrayResult1 (do you have a better name for this?), which
> accepts a bool release parameter. array_agg_finalfn becomes cleaner, and
> there is no duplicate code in array_agg_anyarray_finalfn.
>

 makeArrayResult1 - I have no better name now

I found one more minor detail.

You reuse the array_agg_transfn function. Inside is the message
"array_agg_transfn called in non-aggregate context", which is not correct
for array_agg_anyarray_transfn.

The specification of dims and lbs in array_agg_finalfn is probably useless
when you expect a natural result. Automatic behavior similar to
array_agg_anyarray_finalfn should work there too.

makeMdArrayResultArray and appendArrayDatum should be marked as "static".


>
>>> Next question: there is a function array_append(anyarray, anyelement).
>>> Isn't it time to define array_append(anyarray, anyarray) now?
>>>
>>
>> There is array_cat(anyarray, anyarray):
>>
>> /*-----------------------------------------------------------------------
>>  * array_cat :
>>  * concatenate two nD arrays to form an nD array, or
>>  * push an (n-1)D array onto the end of an nD array
>>  *-----------------------------------------------------------------------
>>  */
>>
>
> Regards,
> --
> Ali Akbar
>


Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Ali Akbar
2014-10-25 10:29 GMT+07:00 Ali Akbar :

>> I fixed a small issue in the regression tests and enhanced the tests
>> for varlena types and null values.
>
> Thanks.
>
> it is about 15% faster than the original implementation.
>
> 15% faster than array_agg(scalar)? I haven't verified the performance, but
> because the internal array data and null bitmap are copied as-is, it
> should be faster.
>
> 2014-10-25 1:51 GMT+07:00 Pavel Stehule :
>
>> Hi Ali
>>
>
>> I checked the code. I think the code organization is not good.
>> accumArrayResult is too long now. makeMdArrayResult will not work when
>> arrays are joined (it is not consistent now). I don't like the usage of
>> state->is_array_accum in array_userfunc.c -- it is a signal of wrong
>> wrapping.
>>
> Yes, I was thinking the same. Attached is a WIP patch to reorganize the
> code. makeMdArrayResult works now, with the supplied arguments acting as
> overrides of the default values calculated from ArrayBuildStateArray.
>
> In array_userfunc.c, because we don't want to override ndims, dims and
> lbs, I copied array_agg_finalfn and only changed the call to
> makeMdArrayResult (we don't use makeArrayResult because we want to set
> release to false). Another alternative is to create a new
> makeArrayResult-like function that has a bool release parameter.
>

I added makeArrayResult1 (do you have a better name for this?), which
accepts a bool release parameter. array_agg_finalfn becomes cleaner, and
there is no duplicate code in array_agg_anyarray_finalfn.
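The core job of the transition function being discussed - append each
input sub-array to a flat buffer, fix the inner dimensions from the first
input, and reject inconsistent or empty inputs - can be sketched in
self-contained C. All names here are invented for illustration; the real
code works on ArrayType, Datums, and a null bitmap inside
accumArrayResult's state:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy accumulation state for array_agg(anyarray)-style aggregation. */
typedef struct
{
	int			sub_len;	/* length of each sub-array (-1 = none yet) */
	int			naccum;		/* number of sub-arrays accumulated */
	int		   *data;		/* flat buffer holding all elements */
	int			capacity;
} MockArrayAccumState;

static void
mock_accum_init(MockArrayAccumState *state)
{
	state->sub_len = -1;
	state->naccum = 0;
	state->capacity = 8;
	state->data = malloc(state->capacity * sizeof(int));
}

/* Returns false (rejecting the input) on empty or mismatched arrays,
 * mirroring "Rejects NULL and empty array as input" from the doc patch. */
static bool
mock_accum_array(MockArrayAccumState *state, const int *elems, int len)
{
	if (len == 0)
		return false;			/* empty arrays are rejected */
	if (state->sub_len < 0)
		state->sub_len = len;	/* first input fixes the inner dimension */
	else if (state->sub_len != len)
		return false;			/* inconsistent dimensions */

	while ((state->naccum + 1) * state->sub_len > state->capacity)
	{
		state->capacity *= 2;
		state->data = realloc(state->data, state->capacity * sizeof(int));
	}
	/* bulk copy: the whole sub-array is appended as-is */
	memcpy(state->data + state->naccum * state->sub_len,
		   elems, len * sizeof(int));
	state->naccum++;
	return true;
}
```

The final function then only has to wrap `data` with dims
`[naccum][sub_len]`, which is exactly why the 1-D and N-D result builders
need a clean separation.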


>> Next question: there is a function array_append(anyarray, anyelement).
>> Isn't it time to define array_append(anyarray, anyarray) now?
>>
>
> There is array_cat(anyarray, anyarray):
>
> /*-----------------------------------------------------------------------
>  * array_cat :
>  * concatenate two nD arrays to form an nD array, or
>  * push an (n-1)D array onto the end of an nD array
>  *-----------------------------------------------------------------------
>  */
>

Regards,
-- 
Ali Akbar
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7e5bcd9..f59738a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12046,6 +12046,22 @@ NULL baz(3 rows)
  
   

+array_agg
+   
+   array_agg(anyarray)
+  
+  
+   any
+  
+  
+   the same array type as input type
+  
+  input arrays, aggregated into higher-order multidimensional array. Rejects NULL and empty array as input.
+ 
+
+ 
+  
+   
 average


diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 2f0680f..8c182a4 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -2238,6 +2238,11 @@ SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
  array
 ---
  {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413}
+
+SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i));
+ array
+---
+ {{1},{2},{3},{4},{5}}
 (1 row)
 
The subquery must return a single column. The resulting
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 41e973b..0261fcb 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -108,12 +108,16 @@ exprType(const Node *expr)
 	type = exprType((Node *) tent->expr);
 	if (sublink->subLinkType == ARRAY_SUBLINK)
 	{
-		type = get_array_type(type);
-		if (!OidIsValid(type))
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("could not find array type for data type %s",
-							format_type_be(exprType((Node *) tent->expr)))));
+		if (!OidIsValid(get_element_type(type)))
+		{
+			/* not array, so check for its array type */
+			type = get_array_type(type);
+			if (!OidIsValid(type))
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_OBJECT),
+						 errmsg("could not find array type for data type %s",
+								format_type_be(exprType((Node *) tent->expr)))));
+		}
 	}
 }
 else if (sublink->subLinkType == MULTIEXPR_SUBLINK)
@@ -139,12 +143,16 @@ exprType(const Node *expr)
 	type = subplan->firstColType;
 	if (subplan->subLinkType == ARRAY_SUBLINK)
 	{
-		type = get_array_type(type);
-		if (!OidIsValid(type))
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("could not find array type for data type %s",
-							format_type_be(subplan->firstColType))));
+		if (!OidIsValid(get_element_type(type)))
+		{
+			/* not array, so check for its array type */
+			type = get_array_type(type);
+			if (!OidIsValid(type))
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_OBJECT),
+						 errmsg("could not find array type for data type %s",
+								format_type_be(subplan->firstColType))));
+		}
 	}
 }
 else if (subplan->subLinkType == MULTIEXPR_SUBLINK)

Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-10-24 Thread Amit Kapila
On Fri, Oct 24, 2014 at 4:05 PM, Andres Freund 
wrote:
>
> On 2014-10-24 15:59:30 +0530, Amit Kapila wrote:
> > > > and w.r.t. performance it can lead to an extra
> > > > function call, a few checks, and I think in some cases can even
> > > > acquire/release a spinlock.
> > >
> > > I fail to see how that could be the case.
> >
> > Won't it happen in case the first backend sets releaseOK to true and
> > another backend which tries to wake up waiters on the lock acquires
> > the spinlock and tries to release the waiters?
>
> Sure, that can happen.
> > > And again, this is code that's
> > > only executed around a couple syscalls. And the cacheline will be
> > > touched around there *anyway*.
> >
> > Sure, but I think syscalls are required in case we need to wake any
> > waiter.
>
> It won't wake up a waiter if there's none on the list.

Yes, but it will still acquire/release the spinlock.

> > > > > And it'd be a pretty pointless
> > > > > behaviour, leading to useless increased contention. The only time
> > > > > it'd make sense for X to be woken up is when it gets run faster
> > > > > than the S processes.
> > > >
> > > > Do we get any major benefit by changing the logic of waking up
> > > > waiters?
> > >
> > > Yes.
> >
> > I think one downside I see of the new strategy is that the chance of an
> > Exclusive waiter taking more time before being woken up is increased,
> > as it will now bypass Exclusive waiters in the queue.
>
> Note that that *already* happens for any *new* shared locker that comes
> in. It doesn't really make sense to have share lockers queued behind the
> exclusive locker if others just go in front of it anyway.

Yeah, but I think it is difficult to avoid that behaviour, as even when it
wakes an Exclusive locker, some new shared locker can come in and acquire
the lock before the Exclusive locker.

I think it is difficult to say what the best waking strategy is, as the
priority of Exclusive lockers is not clearly defined in the case of
LWLocks.
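The fairness question above can be made concrete with a tiny model in C.
All names are invented; this is not the actual lwlock.c logic, just the
observable rule under discussion: a newly arriving shared locker is
granted whenever the lock is held in shared mode, even if an exclusive
waiter is already queued behind the current holders.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal model of an LWLock's grant policy. */
typedef struct
{
	int		shared_holders;		/* how many share the lock right now */
	bool	exclusive_held;		/* one exclusive holder? */
	int		exclusive_waiters;	/* exclusive requests queued */
} MockLWLock;

/* Shared acquire: succeeds unless an exclusive holder has the lock. */
static bool
mock_acquire_shared(MockLWLock *lock)
{
	if (lock->exclusive_held)
		return false;
	lock->shared_holders++;
	return true;
}

/* Exclusive acquire: needs the lock completely free; otherwise it queues. */
static bool
mock_acquire_exclusive(MockLWLock *lock)
{
	if (lock->exclusive_held || lock->shared_holders > 0)
	{
		lock->exclusive_waiters++;
		return false;
	}
	lock->exclusive_held = true;
	return true;
}
```

Running the S1, X, S2 arrival order through this model shows the queued
exclusive waiter being bypassed, which is exactly why "which waiters to
wake" has no obviously best answer without a defined priority for
exclusive lockers.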


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Ali Akbar
>
> I fixed a small issue in the regression tests and enhanced the tests for
> varlena types and null values.

Thanks.

it is about 15% faster than the original implementation.

15% faster than array_agg(scalar)? I haven't verified the performance, but
because the internal array data and null bitmap are copied as-is, it
should be faster.
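The "copied as-is" point is easy to illustrate: appending a sub-array's
null bitmap element by element does the same work as one memcpy of the
packed bytes, just with far more per-bit overhead. A hedged, self-contained
sketch (PostgreSQL's real bitmaps can start at arbitrary bit offsets, so
the as-is copy only applies when the destination is byte-aligned or
handled with shifts; names below are invented):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Set bit i of a bitmap (1 = value present, 0 = NULL), as a
 * per-element append would. */
static void
bitmap_set(uint8_t *bitmap, int i, int present)
{
	if (present)
		bitmap[i / 8] |= (uint8_t) (1 << (i % 8));
	else
		bitmap[i / 8] &= (uint8_t) ~(1 << (i % 8));
}

/* Per-element path: one call per array element. */
static void
append_bits_one_by_one(uint8_t *dst, const uint8_t *src, int nbits)
{
	for (int i = 0; i < nbits; i++)
		bitmap_set(dst, i, (src[i / 8] >> (i % 8)) & 1);
}

/* Bulk path: the whole bitmap is copied as-is in one memcpy. */
static void
append_bits_as_is(uint8_t *dst, const uint8_t *src, int nbits)
{
	memcpy(dst, src, (nbits + 7) / 8);
}
```

Both paths yield an identical bitmap; the bulk path simply skips the
per-bit shifting and masking, which is where the measured speedup would
come from.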

2014-10-25 1:51 GMT+07:00 Pavel Stehule :

> Hi Ali
>

> I checked the code. I think the code organization is not good.
> accumArrayResult is too long now. makeMdArrayResult will not work when
> arrays are joined (it is not consistent now). I don't like the usage of
> state->is_array_accum in array_userfunc.c -- it is a signal of wrong
> wrapping.
>
Yes, I was thinking the same. Attached is a WIP patch to reorganize the
code. makeMdArrayResult works now, with the supplied arguments acting as
overrides of the default values calculated from ArrayBuildStateArray.

In array_userfunc.c, because we don't want to override ndims, dims and
lbs, I copied array_agg_finalfn and only changed the call to
makeMdArrayResult (we don't use makeArrayResult because we want to set
release to false). Another alternative is to create a new
makeArrayResult-like function that has a bool release parameter.


> Next question: there is a function array_append(anyarray, anyelement).
> Isn't it time to define array_append(anyarray, anyarray) now?
>

There is array_cat(anyarray, anyarray):
/*-----------------------------------------------------------------------
 * array_cat :
 * concatenate two nD arrays to form an nD array, or
 * push an (n-1)D array onto the end of an nD array
 *-----------------------------------------------------------------------
 */

Regards,
--
Ali Akbar
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7e5bcd9..f59738a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12046,6 +12046,22 @@ NULL baz(3 rows)
  
   

+array_agg
+   
+   array_agg(anyarray)
+  
+  
+   any
+  
+  
+   the same array type as input type
+  
+  input arrays, aggregated into higher-order multidimensional array. Rejects NULL and empty array as input.
+ 
+
+ 
+  
+   
 average


diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 2f0680f..8c182a4 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -2238,6 +2238,11 @@ SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
  array
 ---
  {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413}
+
+SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i));
+ array
+---
+ {{1},{2},{3},{4},{5}}
 (1 row)
 
The subquery must return a single column. The resulting
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 41e973b..0261fcb 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -108,12 +108,16 @@ exprType(const Node *expr)
 	type = exprType((Node *) tent->expr);
 	if (sublink->subLinkType == ARRAY_SUBLINK)
 	{
-		type = get_array_type(type);
-		if (!OidIsValid(type))
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("could not find array type for data type %s",
-							format_type_be(exprType((Node *) tent->expr)))));
+		if (!OidIsValid(get_element_type(type)))
+		{
+			/* not array, so check for its array type */
+			type = get_array_type(type);
+			if (!OidIsValid(type))
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_OBJECT),
+						 errmsg("could not find array type for data type %s",
+								format_type_be(exprType((Node *) tent->expr)))));
+		}
 	}
 }
 else if (sublink->subLinkType == MULTIEXPR_SUBLINK)
@@ -139,12 +143,16 @@ exprType(const Node *expr)
 	type = subplan->firstColType;
 	if (subplan->subLinkType == ARRAY_SUBLINK)
 	{
-		type = get_array_type(type);
-		if (!OidIsValid(type))
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("could not find array type for data type %s",
-							format_type_be(subplan->firstColType))));
+		if (!OidIsValid(get_element_type(type)))
+		{
+			/* not array, so check for its array type */
+			type = get_array_type(type);
+			if (!OidIsValid(type))
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_OBJECT),
+						 errmsg("could not find array type for data type %s",
+								format_type_be(subplan->firstColType))));
+		}
 	}
 }
 else if (subplan->subLinkType == MULTIEXPR_SUBLINK)
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 3e7dc85..8fc8b49 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -668,10 +668,16 @@ build_subplan(Pla

Re: [HACKERS] How ugly would this be? (ALTER DATABASE)

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 7:37 PM, Jim Nasby  wrote:
> ISTM that the multiple-databases-per-backend issue is the huge hang-up here.
> Maybe there's some way that could be hacked around if you're just
> re-jiggering a bunch of catalog stuff (assuming you lock users out of both
> databases while you're doing that), but if you were going to go to that
> extent perhaps it'd be better to just support cross-database access in a
> single backend...

Good luck with that.  It's probably as hard or harder than making the
backend multi-threaded, which is itself harder than any project a
reasonable person will undertake any time in the foreseeable future.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] security barrier INSERT

2014-10-24 Thread Brightwell, Adam
Drew,


> IMHO this is nonintuitive; the intuitive behavior of a security_barrier
> view should be to forbid inserting rows that can’t appear in the view.
>

Isn't that what WITH CHECK OPTION is meant to accomplish?

-Adam



-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com


[HACKERS] security barrier INSERT

2014-10-24 Thread Drew Crawford
Hi guys, I’m seeing some non-intuitive behavior with the new updatable
security barrier views in 9.4.  Below is the behavior of 9.4b3:

> =# create table widget ( id integer);
> CREATE TABLE
> =# create view widget_sb WITH (security_barrier=true) AS SELECT * from widget 
> where id = 22;
> CREATE VIEW
> =# insert into widget_sb (id) values (23);
> INSERT 0 1
> =# select * from widget;
>  id 
> 
>  23
> (1 row)


I think the insert should fail, since the view can only contain widgets
with id = 22, and the widget being inserted has id 23.  In reality, the
insert to the underlying table succeeds as shown above, although the row
remains absent from the view:

> =# select * from widget_sb;
>  id 
> 
> (0 rows)


IMHO this is nonintuitive; the intuitive behavior of a security_barrier
view should be to forbid inserting rows that can’t appear in the view.

Have I missed something fundamental here?

Drew

Re: [HACKERS] Comment in outfunc.c

2014-10-24 Thread Tom Lane
Tatsuo Ishii  writes:
> I saw this comment in _outQuery() in outfuncs.c:

>   WRITE_ENUM_FIELD(commandType, CmdType);
>   WRITE_ENUM_FIELD(querySource, QuerySource);
>   /* we intentionally do not print the queryId field */
>   WRITE_BOOL_FIELD(canSetTag);

> What is the reasoning behind the comment?

We don't want the queryId copied when a view/rule is reloaded,
since there's no guarantee that the algorithm for computing it
is stable for the life of a PG major version.  Instead it's reset
to zero on reload, and any plugins that care can recompute it.

I suppose we could print it anyway and then ignore it on reload
but what's the point of that?

regards, tom lane




[HACKERS] Comment in outfunc.c

2014-10-24 Thread Tatsuo Ishii
Hi,

I saw this comment in _outQuery() in outfuncs.c:

WRITE_ENUM_FIELD(commandType, CmdType);
WRITE_ENUM_FIELD(querySource, QuerySource);
/* we intentionally do not print the queryId field */
WRITE_BOOL_FIELD(canSetTag);

What is the reasoning behind the comment?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




[HACKERS] Comment patch for bgworker.c

2014-10-24 Thread Jim Nasby

The comment for the BackgroundWorkerSlot structure tripped me up while
reviewing Robert's background worker patch; it makes it clear that you
need to use a memory barrier before setting in_use, but normally you'd
never need to worry about that because RegisterDynamicBackgroundWorker()
handles it for you. The patch adds a comment to that effect.
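The publish protocol the comment describes - initialize everything, issue
a write barrier, only then mark the slot in use - can be sketched with
C11 atomics. This is a mock for illustration (PostgreSQL uses its own
pg_write_barrier() primitive and a richer BackgroundWorkerSlot struct;
names here are invented):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Cut-down stand-in for BackgroundWorkerSlot. */
typedef struct
{
	atomic_bool in_use;
	bool		terminate;
	char		worker_name[64];
} MockWorkerSlot;

static void
mock_register_worker(MockWorkerSlot *slot, const char *name)
{
	/* 1. Fully initialize every field except in_use. */
	slot->terminate = false;
	strncpy(slot->worker_name, name, sizeof(slot->worker_name) - 1);
	slot->worker_name[sizeof(slot->worker_name) - 1] = '\0';

	/* 2. Write barrier: the stores above become visible to other
	 *    processes before the in_use store below does. */
	atomic_thread_fence(memory_order_release);

	/* 3. Only now publish the slot. */
	atomic_store_explicit(&slot->in_use, true, memory_order_relaxed);
}
```

The ordering is the whole point: an observer that sees `in_use == true`
is guaranteed to also see the initialized fields, never a half-built slot.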
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
diff --git a/src/backend/postmaster/bgworker.c 
b/src/backend/postmaster/bgworker.c
index 85a3b3a..e81dbc1 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -55,7 +55,8 @@ slist_headBackgroundWorkerList = 
SLIST_STATIC_INIT(BackgroundWorkerList);
  * responsibility of the postmaster.  Regular backends may no longer modify it,
  * but the postmaster may examine it.  Thus, a backend initializing a slot
  * must fully initialize the slot - and insert a write memory barrier - before
- * marking it as in use.
+ * marking it as in use. Note that RegisterDynamicBackgroundWorker() handles
+ * in_use correctly for you.
  *
  * As an exception, however, even when the slot is in use, regular backends
  * may set the 'terminate' flag for a slot, telling the postmaster not



Re: [HACKERS] superuser() shortcuts

2014-10-24 Thread Jim Nasby



On 10/23/14, 6:23 PM, Alvaro Herrera wrote:

Brightwell, Adam wrote:


If we were to make it consistent and use the old wording, what do you
think about providing an "errhint" as well?

Perhaps for example in slotfuncs.c#pg_create_physical_replication_slot:

errmsg - "permission denied to create physical replication slot"
errhint - "You must be superuser or replication role to use replication slots."


Sure.


As I started looking at this, there are multiple other places where
these types of error messages occur (opclasscmds.c, user.c,
postinit.c, miscinit.c are just a few), not just around the changes in
this patch.  If we change them in one place, wouldn't it be best to
change them in the rest?  If that is the case, I'm afraid that might
distract from the purpose of this patch.  Perhaps, if we want to
change them, then that should be submitted as a separate patch?


Yeah.  I'm just saying that maybe this patch should adopt whatever
wording we agree to, not that we need to change other places.  On the
other hand, since so many other places have adopted the different
wording, maybe there's a reason for it and if so, does anybody know what
it is.  But I have to say that it does look inconsistent to me.


Keep in mind that originally the ONLY special permissions "thing" we had was rolsuper. 
Now we also have replication, maybe some others, and there's discussion of adding a special 
"backup role".

In other words, the situation is a lot more complex than it used to be. So if 
cleanup is done here, it would be best to try and account for this new stuff.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] ExclusiveLock on extension of relation with huge shared_buffers

2014-10-24 Thread Jim Nasby

Please don't top-post.

On 10/24/14, 3:40 AM, Borodin Vladimir wrote:

I have taken some backtraces (they are attached to the letter) of two processes 
with such command:
pid=17981; while true; do date; gdb -batch -e back /usr/pgsql-9.4/bin/postgres 
$pid; echo; echo; echo; echo; sleep 0.1; done

Process 17981 was holding the lock for a long time - http://pastie.org/9671931.
And process 13886 was waiting for the lock (at a different time and from a
different blocker actually, but I don’t think that is really important) -
http://pastie.org/9671939.

As I can see, 17981 is actually waiting for LWLock on BufFreelistLock in 
StrategyGetBuffer function, freelist.c:134 while holding exclusive lock on 
relation. I will try to increase NUM_BUFFER_PARTITIONS (on read-only load it 
also gave us some performance boost) and write the result in this thread.


BufFreelistLock becomes very contended when shared buffers are under a lot of 
pressure.

Here's what I believe is happening:

If RelationGetBufferForTuple() decides it needs to extend, this happens:
LockRelationForExtension(relation, ExclusiveLock);
buffer = ReadBufferBI(relation, P_NEW, bistate);

Assuming bistate is false (I didn't check the bulk case), ReadBufferBI() ends 
up at ReadBuffer_common(), which calls BufferAlloc(). In the normal case, 
BufferAlloc() won't find the necessary buffer, so it will call 
StrategyGetBuffer(), which will end up getting the freelist lock. Currently the 
free list is normally empty, which means we now need to run the clock sweep to 
find a victim buffer. The clock sweep will keep running until it finds a buffer 
that is not pinned and has usage_count = 0. If shared buffers are under heavy 
pressure, you can have a huge number of them with usage_count = 5, which for 
100GB shared buffers and an 8K BLCKSZ means you could have to check buffers 
*52 million* times (assuming you finally find a buffer at the start of the 
5th loop) before you find a victim.
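To make the clock-sweep arithmetic concrete, here is a toy, self-contained
model (all names invented; the real logic is StrategyGetBuffer() in
src/backend/storage/buffer/freelist.c). With every buffer at
usage_count = 5, the hand must pass the whole pool five times, decrementing
as it goes, before the first eviction:

```c
#include <assert.h>
#include <stdbool.h>

#define NBUFFERS 8

/* Toy buffer: usage_count is 0..5 in PostgreSQL. */
typedef struct
{
	int		usage_count;
	bool	pinned;
} MockBuffer;

/* Returns the index of the victim buffer; *nchecks counts how many
 * buffers were examined first.  Assumes at least one buffer is unpinned
 * (the real code errors out otherwise). */
static int
mock_clock_sweep(MockBuffer *buf, int nbuffers,
				 int *next_victim, int *nchecks)
{
	*nchecks = 0;
	for (;;)
	{
		MockBuffer *b = &buf[*next_victim];
		int			victim = *next_victim;

		*next_victim = (*next_victim + 1) % nbuffers;
		(*nchecks)++;
		if (!b->pinned)
		{
			if (b->usage_count == 0)
				return victim;			/* found a victim */
			b->usage_count--;			/* second chance; keep sweeping */
		}
	}
}
```

In this simplified model a fully "hot" pool of N buffers costs 5*N + 1
checks for a single eviction; scale N up to ~13 million buffers (100GB at
8K) and the cost of each victim search under the freelist lock becomes
obvious.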

Keep in mind that's all happening while you're holding both the extension lock 
*and the freelist lock*, which basically means no one else in the entire system 
can allocate a new buffer.

This is one reason why a large shared_buffers setting is usually 
counter-productive. Experience with older versions is that setting it higher than 
about 8GB is more likely to hurt than to help. Newer versions are probably better, 
but I think you'll be hard-pressed to find a workload where 100GB makes sense. It 
might if your entire database fits in shared_buffers (though, even then there's 
probably a number of O(n) or worse operations that will hurt you), but if your 
database is > shared_buffers you're probably in trouble.

I suggest cutting shared_buffers *way* down. Old-school advice for this machine 
would be 8G (since 25% of 128G would be too big). You might be able to do 
better than 8G, but I recommend not even trying unless you've got a good way to 
test your performance.

If you can test performance and find an optimal setting for shared_buffers, 
please do share your test data and findings. :)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-10-24 Thread Peter Geoghegan
On Fri, Oct 24, 2014 at 4:39 PM, Petr Jelinek  wrote:
> If you feel so strongly that it's wrong even though everybody else seems to
> prefer it and if you at the same time feel so strongly about people changing
> minds once you implement this, maybe the best way to convince us is to show
> us the implementation (at this point it would probably have taken less of
> your time than the argument did).

No, it wouldn't have - I don't think anyone believes that. Magic
addRangeTableEntryForRelation() calls are only used in the context of
one or two utility statements that have pretty limited scope. Support
for an OLD.* style syntax would have to exist at *all* stages of query
execution, from parse analysis through to rewriting, planning, and
execution. That's the difference here - this isn't a utility command.

>> So in an UPDATE targetlist, you can assign DEFAULT to a column. Maybe
>> that's an interesting precedent. During rewriting, this gets rewritten
>> such that you end up with something that looks to the planner as if
>> the original query included a constant (this actually comes from a
>> catalog look-up for the column during rewriting). What if we spelled
>> EXCLUDING/CONFLICTING as follows:
>>
>> INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val =
>> EXCLUDED || 'this works' WHERE another_col != EXCLUDED;
>>
>> Then rewriting would figure these details out. From a design
>> perspective, there'd need to be a few details worked out about how
>> inference actually works - inferring *which* column the EXCLUDED
>> expression actually referred to, but it seems doable, especially given
>> the existing restrictions on the structure of the UPDATE. We're not
>> rewriting from a SetToDefault to a constant, but a SetToDefault-like
>> thing to a special Var (actually, the finished representation probably
>> makes it to the execution stage with that Var representation filled
>> in, unlike SetToDefault, but it's basically the same pattern). It
>> solves my problem with dummy range table entries. Actually, *any* new
>> kind of expression accomplishes this just as well. My concern here is
>> more around not needing cute tricks with dummy RTEs than it is around
>> being in favor of any particular expression-based syntax.
>>
>> What do you think of that?
>>
>
> Ugh, you want to auto-magically detect what value is behind the EXCLUDED
> based on how/where it's used in the UPDATE? That seems like quite a bad
> idea.

That's *exactly* how DEFAULT works within UPDATE targetlists. There
might be a few more details to work out here, but not terribly many,
and that's going to be true no matter what. 95%+ of the time, it'll
just be "val = EXCLUDED" anyway.

-- 
Peter Geoghegan




Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-10-24 Thread Petr Jelinek

On 25/10/14 00:48, Peter Geoghegan wrote:

You're conflating the user-visible syntax with the parse tree
representation in way that is utterly without foundation.  I don't
have a position at this point on which parse-analysis representation
is preferable, but it's completely wrong-headed to say that you
"can't" make NEW.x produce the same parse-analysis output that your
CONFLICTING(x) syntax would have created.  Sure, it might be harder,
but it's not that much harder, and it's definitely not an unsolvable
problem.


I don't believe I did. The broader point is that the difficulty in
making that work reflects the conceptual messiness, from
user-visible aspects down. I can work with the difficulty, and I may
even be able to live with the messiness, but I'm trying to bring the
problems with it to a head sooner rather than later for good practical
reasons. In all sincerity, my real concern is that you or the others
will change your mind when I actually go and implement an OLD.* style
syntax, and see the gory details. I regret it if to ask this is to ask
too much of you, but FYI that's the thought process behind it.



If you feel so strongly that it's wrong even though everybody else seems 
to prefer it and if you at the same time feel so strongly about people 
changing minds once you implement this, maybe the best way to convince 
us is to show us the implementation (at this point it would probably 
have taken less of your time than the argument did).




So in an UPDATE targetlist, you can assign DEFAULT to a column. Maybe
that's an interesting precedent. During rewriting, this gets rewritten
such that you end up with something that looks to the planner as if
the original query included a constant (this actually comes from a
catalog look-up for the column during rewriting). What if we spelled
EXCLUDING/CONFLICTING as follows:

INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val =
EXCLUDED || 'this works' WHERE another_col != EXCLUDED;

Then rewriting would figure these details out. From a design
perspective, there'd need to be a few details worked out about how
inference actually works - inferring *which* column the EXCLUDED
expression actually referred to, but it seems doable, especially given
the existing restrictions on the structure of the UPDATE. We're not
rewriting from a SetToDefault to a constant, but a SetToDefault-like
thing to a special Var (actually, the finished representation probably
makes it to the execution stage with that Var representation filled
in, unlike SetToDefault, but it's basically the same pattern). It
solves my problem with dummy range table entries. Actually, *any* new
kind of expression accomplishes this just as well. My concern here is
more around not needing cute tricks with dummy RTEs than it is around
being in favor of any particular expression-based syntax.

What do you think of that?



Ugh, you want to auto-magically detect what value is behind the EXCLUDED 
based on how/where it's used in the UPDATE? That seems like quite a bad 
idea.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] How ugly would this be? (ALTER DATABASE)

2014-10-24 Thread Jim Nasby

On 10/24/14, 1:28 PM, Robert Haas wrote:

On Fri, Oct 24, 2014 at 2:06 PM, Joshua D. Drake  wrote:

One of the things we run into quite a bit is customers who are using
multiple databases when they should be using multiple schemas. I am sure
other consultants run into this as well. This gets even more difficult as
uptime requirements have become all but 100%. So my question is, what would
this take?

ALTER DATABASE foo LOCATION DATABASE bar SCHEMA baz?

Where if we execute that command, database foo would move to schema baz
within database bar?

I am fully aware of what it takes on the client side but structurally within
postgres what would it take? Is it even reasonable?


What if the database contains more than one schema?

You could perhaps try to create a command that would move a schema
between two databases in the same cluster.  It's fraught with
practical difficulties because a single backend can't be connected to
both databases at the same time, so how exactly do you make the
required catalog changes all in a single transaction?  But if you
imagine that you have an infinite pool of top-notch PostgreSQL talent
with unbounded time to work on this problem and no other, I bet
somebody could engineer a solution.


ISTM that the multiple-databases-per-backend issue is the huge hang-up here. 
Maybe there's some way that could be hacked around if you're just re-jiggering 
a bunch of catalog stuff (assuming you lock users out of both databases while 
you're doing that), but if you were going to go to that extent perhaps it'd be 
better to just support cross-database access in a single backend...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Petr Jelinek

On 24/10/14 23:07, Robert Haas wrote:

On Fri, Oct 24, 2014 at 4:53 PM, Jim Nasby  wrote:

The only case I can think of would be actually connecting to a remote
database; in that case would we even want something as raw as this? I
suspect not, in which case I don't see an issue. On the other hand, if we
ever think we might want to do that, we should probably at least stick a
version number field in there...

But my suspicion is if we ever wanted to do something more with this then
we'd want some kind of text-based format that could be passed into a SQL
command (ie: SET ENVIRONMENT TO blah;)


I mean, I don't think this is much different than what we're already
doing to transfer variables from the postmaster to other backends in
EXEC_BACKEND builds; see write_nondefault_variables().  It doesn't
have a version number or anything either.  I don't see a reason why
this code needs to be held to a different standard; the purpose is
fundamentally quite similar.



I really do think that the GUC patch is OK as it is; we might want to do 
1/0 for bools, but I don't really see that as an important thing. And it 
works for the use-cases it's intended for. I don't see why this should 
really be required to work cross-version.
Loading of modules by the bgworker is probably a bigger issue than just 
GUCs, and I think it's out of scope for the current patch(set).


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Jim Nasby

On 10/24/14, 4:03 PM, Robert Haas wrote:

On Fri, Oct 24, 2014 at 4:46 PM, Jim Nasby  wrote:

On 10/24/14, 12:21 PM, Robert Haas wrote:

- What should we call dsm_unkeep_mapping, if not that?


Only option I can think of beyond unkeep would be
dsm_(un)register_keep_mapping. Dunno that it's worth it.


Hmm, we could rename dsm_keep_mapping() to dsm_unregister_mapping(),
since it's arranging to keep it by unregistering it from the resource
owner.  And then we could call the new function
dsm_register_mapping().  That has the appeal that "unregister" is a
word, whereas "unkeep" isn't, but it's a little confusing otherwise,
because the sense is reversed vs. the current naming.  Or we could
just leave dsm_keep_mapping() alone and call the new function
dsm_register_mapping().  A little non-orthogonal, but I think it'd be
OK.


I think it's actually important to keep the names matching, even if it means 
"unkeep".

dsm_unregister_mapping seems like the opposite of what you'd want to do to have 
the mapping be remembered. I get that it works that way because of how it's 
implemented, but that seems like a lot of leakage of implementation into API...

FWIW, I don't see "unkeep" as being that horrible.


- Does anyone have a tangible suggestion for how to reduce the code
duplication in patch #6?


Between execute_sql_string() and tcop/exec_simple_query()? Is there stuff in
exec_simple that's not safe for a bgworker? I'm not seeing why we can't use
exec_simple. :/


There are some differences if you compare them closely.


Without doing a deep dive, I'm guessing that most of the extra stuff would be 
safe to re-use; it just wouldn't affect execute_sql_string. Obviously we could 
add a boolean to exec_simple_query for the case when it's being used by a 
bgworker. Though, it seems like the biggest differences have to do with logging.

Here are the differences I see:

- Disallowing transaction commands
- Logging
- What memory context is used (could we just use that differently in a 
pg_background backend?)
- Portal output format
- What to do with the output of intermediate commands (surely there's other 
places we need to handle that? plpgsql maybe?)

I think all of those except logging could be handled by a function serving both 
exec_simple_query and execute_sql_string that accepts a few booleans (or maybe 
just one to indicate the caller) and some if's. At least I don't think it'd be 
too bad, without actually writing it.
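
A minimal sketch of that shape, with invented names (the real callers are 
exec_simple_query() in tcop/postgres.c and the patch's execute_sql_string()); 
a single caller tag drives the few behavioral differences:

```c
#include <stddef.h>

/*
 * Hypothetical sketch: one worker routine serves both callers, and a
 * caller tag drives the few behavioral differences.  Names invented
 * for illustration; this is not code from the patch.
 */
typedef enum
{
    QUERY_CALLER_SIMPLE,        /* interactive exec_simple_query() */
    QUERY_CALLER_BACKGROUND     /* pg_background's execute_sql_string() */
} QueryCaller;

/*
 * Run a list of statements; is_xact_cmd flags transaction-control
 * commands, which a background worker must reject.  Returns the number
 * of statements run, or -1 on a disallowed command.
 */
static int
run_parsetree_list(QueryCaller caller, int n_stmts, const int *is_xact_cmd)
{
    int i;

    for (i = 0; i < n_stmts; i++)
    {
        if (caller == QUERY_CALLER_BACKGROUND && is_xact_cmd[i])
            return -1;
        /* ... parse analysis, planning, execution, logging ... */
    }
    return n_stmts;
}
```

Whether a tag or several booleans is cleaner probably depends on how many 
differences actually survive the refactoring.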

I'm not sure what to do about logging. If the original backend has logging 
turned on, is it that horrible to do the same logging as exec_simple_query 
would do?

Another option would be factoring out parts of exec_simple_query; the for loop 
over the parse tree might be a good candidate. But I suspect that would be 
uglier than a separate support function.

I do feel uncomfortable with the amount of duplication there is right now 
though...


BTW, it does occur to me that we could do something to combine
AllocSetContextCreate() followed by oldcontext=MemoryContextSwitchTo().


Meh.


Well, it does seem to be a fairly common pattern throughout the codebase. And 
you were pretty vague when you asked for ideas to reduce duplication. ;P


+   default:
+   elog(WARNING, "unknown message type: %c (%zu
bytes)",
+msg.data[0], nbytes);
It'd be useful to also provide DEBUG output with the message itself...


I think that's more work than is justified.  We'd have to hexdump it
or something because it might not be valid in the client encoding, and
it's a case that shouldn't happen anyway.


I'm worried that a user wouldn't have any good way to see what the unexpected 
data is... but I guess if a user ever saw this we've got bigger problems 
anyway...


(Hmm, that reminds me that I haven't thought of a solution to the
problem that the background worker might try to SET client_encoding,
which would be bad.  Not sure what the best fix for that is, off-hand.
I'd rather not prohibit ALL GUC changes in the background worker, as
there's no good reason for such a draconian restriction.)


Perhaps a new GucContext for background workers? Or maybe a new GucSource, 
though I'm not sure if that'd work and it seems pretty wrong anyway.

BTW, perhaps the amount of discussion on internal parts is an indication this 
shouldn't be in contrib. I think it's a damn strong indication it shouldn't be 
in PGXN. :)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Petr Jelinek

On 24/10/14 23:03, Robert Haas wrote:

On Fri, Oct 24, 2014 at 4:46 PM, Jim Nasby  wrote:

On 10/24/14, 12:21 PM, Robert Haas wrote:

- What should we call dsm_unkeep_mapping, if not that?


Only option I can think of beyond unkeep would be
dsm_(un)register_keep_mapping. Dunno that it's worth it.


Hmm, we could rename dsm_keep_mapping() to dsm_unregister_mapping(),
since it's arranging to keep it by unregistering it from the resource
owner.  And then we could call the new function
dsm_register_mapping().  That has the appeal that "unregister" is a
word, whereas "unkeep" isn't, but it's a little confusing otherwise,
because the sense is reversed vs. the current naming.  Or we could
just leave dsm_keep_mapping() alone and call the new function
dsm_register_mapping().  A little non-orthogonal, but I think it'd be
OK.



I don't like that too much, but I don't have a better suggestion. If we 
went with one of these, I would prefer taking the route of renaming 
dsm_keep_mapping to dsm_unregister_mapping.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-10-24 Thread Peter Geoghegan
On Fri, Oct 24, 2014 at 1:07 PM, Robert Haas  wrote:
> The problem here isn't that I haven't read your emails.  I have read
> them all, including that one.  Moreover, this isn't the first time
> you've asserted that someone hasn't read one of your emails.  So if
> we're going to say what we each think would be helpful, then I think
> it would be helpful if you stopped accusing the people who are taking
> time to provide feedback on your work of having failed to read your
> emails.  It's possible that there may be instances where that problem
> exists, but it beggars credulity to suppose that the repeated
> unanimous consensus against some of your design decisions is entirely
> an artifact of failure to pay attention.

Okay.

> The fact is, I don't feel obliged to respond to every one of your
> emails, just as you don't respond to every one of mine.  If you want
> this patch to ever get committed, it's your job to push it forward -
> not mine, not Simon's, and not Heikki's.  Sometimes that means you
> have to solve hard problems instead of just articulating what they
> are.

Okay.

> You're conflating the user-visible syntax with the parse tree
> representation in way that is utterly without foundation.  I don't
> have a position at this point on which parse-analysis representation
> is preferable, but it's completely wrong-headed to say that you
> "can't" make NEW.x produce the same parse-analysis output that your
> CONFLICTING(x) syntax would have created.  Sure, it might be harder,
> but it's not that much harder, and it's definitely not an unsolvable
> problem.

I don't believe I did. The broader point is that the difficulty in
making that work reflects the conceptual messiness, from
user-visible aspects down. I can work with the difficulty, and I may
even be able to live with the messiness, but I'm trying to bring the
problems with it to a head sooner rather than later for good practical
reasons. In all sincerity, my real concern is that you or the others
will change your mind when I actually go and implement an OLD.* style
syntax, and see the gory details. I regret it if asking this is asking
too much of you, but FYI that's the thought process behind it.

> I do acknowledge that there might be a better syntax out there than
> NEW.x and OLD.x.  I have not seen one proposed that I like better.
> Feel free to propose something.  But don't keep re-proposing something
> that LOOKSLIKE(this) because nobody - other than perhaps you - likes
> that.

Okay.

So in an UPDATE targetlist, you can assign DEFAULT to a column. Maybe
that's an interesting precedent. During rewriting, this gets rewritten
such that you end up with something that looks to the planner as if
the original query included a constant (this actually comes from a
catalog look-up for the column during rewriting). What if we spelled
EXCLUDING/CONFLICTING as follows:

INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val =
EXCLUDED || 'this works' WHERE another_col != EXCLUDED;

Then rewriting would figure these details out. From a design
perspective, there'd need to be a few details worked out about how
inference actually works - inferring *which* column the EXCLUDED
expression actually referred to, but it seems doable, especially given
the existing restrictions on the structure of the UPDATE. We're not
rewriting from a SetToDefault to a constant, but a SetToDefault-like
thing to a special Var (actually, the finished representation probably
makes it to the execution stage with that Var representation filled
in, unlike SetToDefault, but it's basically the same pattern). It
solves my problem with dummy range table entries. Actually, *any* new
kind of expression accomplishes this just as well. My concern here is
more around not needing cute tricks with dummy RTEs than it is around
being in favor of any particular expression-based syntax.

What do you think of that?
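
To make the SetToDefault-like idea concrete, here is a toy model of the 
rewrite step (all types and names invented for illustration; PostgreSQL's 
real parse nodes are far richer): placeholders left by parse analysis are 
replaced during rewriting with resolved column references.

```c
#include <stddef.h>

/*
 * Toy model only: parse analysis leaves EXCLUDED as a placeholder node;
 * rewriting replaces each placeholder with a resolved reference to the
 * inferred column.  Names invented for illustration.
 */
typedef enum
{
    NODE_CONST,
    NODE_EXCLUDED_PLACEHOLDER,  /* what parse analysis emits */
    NODE_VAR,                   /* what rewriting turns it into */
    NODE_OP
} NodeTag;

typedef struct Node
{
    NodeTag     tag;
    int         varattno;       /* resolved column number, for NODE_VAR */
    struct Node *left;
    struct Node *right;
} Node;

/* Recursively replace placeholders with Vars for the inferred column. */
static void
resolve_excluded(Node *node, int inferred_attno)
{
    if (node == NULL)
        return;
    if (node->tag == NODE_EXCLUDED_PLACEHOLDER)
    {
        node->tag = NODE_VAR;
        node->varattno = inferred_attno;
    }
    resolve_excluded(node->left, inferred_attno);
    resolve_excluded(node->right, inferred_attno);
}
```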

> And don't use the difficulty of parse analysis as a
> justification for your proposed syntax, because, except in extreme
> cases, there are going to be very few if any regular contributors to
> this mailing list who will accept that as a justification for one
> syntax over another.  Syntax needs to be justified by being beautiful,
> elegant, precedent-driven, orthogonal, and minimalist - not by whether
> you might need an extra 25-75 lines of parse analysis code to make it
> work.

Well, that isn't the case here. I'd much rather code my way out of a
disagreement with you.

 Unique index inference (i.e. the way we figure out  *which* unique
 index to use) occurs during parse analysis. I think it would be
 inappropriate, and certainly inconvenient to do it during planning.
>>>
>>> You're wrong.  The choice of which index to use is clearly wildly
>>> inappropriately placed in the parse analysis phase, and if you think
>>> that has any chance of ever being committed by anyone, then you are
>>> presuming the existence of a committer who won't mind ignoring the
>>

Re: [HACKERS] Question about RI checks

2014-10-24 Thread Florian Pflug
On Oct24, 2014, at 22:50 , Kevin Grittner  wrote:
> I need to spend some more time looking at it, and I have another
> couple things in front of this on my personal TODO list, but I
> think that if we had a row lock which was stronger than current
> SELECT FOR UPDATE behavior, and the delete of a parent row (or
> update of its referenced key) read the children using it, this
> would be solved.  Essentially I'm thinking of something just like
> FOR UPDATE except that a transaction which attempted a concurrent
> UPDATE or DELETE of the row (before the locking transaction ended)
> would error out with some form of serialization failure.  I believe
> this would be consistent with the behavior of SELECT FOR UPDATE in
> Oracle, so it would allow a path for those migrating from Oracle to
> maintain their current logic (with a slight change to the FOR
> strength clause).

Hm, I don't believe any form of traditional row-level lock on the
child row can fix this. If you look at the second failure case
(an updated isolation tester spec file which includes comments is
attached), you see that it fails even if you define the FK constraint
as CASCADE instead of RESTRICT. So even a full UPDATE or DELETE
of the child rows doesn't help.

But maybe I misunderstood what you proposed.

best regards,
Florian Pflug



fk-consistency2.spec
Description: Binary data



Re: [HACKERS] Typo fixes for pg_recvlogical documentation

2014-10-24 Thread Michael Paquier
On Fri, Oct 24, 2014 at 7:42 PM, Robert Haas  wrote:
> [rhaas pgsql]$ patch -p1 < ~/Downloads/20141023_pg_recvlogical_fixes.patch
> patching file doc/src/sgml/ref/pg_recvlogical.sgml
> Hunk #1 succeeded at 270 with fuzz 1 (offset 165 lines).
> Hunk #2 FAILED at 282.
> Hunk #3 FAILED at 295.
> Hunk #4 FAILED at 308.
> 3 out of 4 hunks FAILED -- saving rejects to file
> doc/src/sgml/ref/pg_recvlogical.sgml.rej
Sorry, I have not rebased much these days; there were conflicts with 52c1ae2...
-- 
Michael
*** a/doc/src/sgml/ref/pg_recvlogical.sgml
--- b/doc/src/sgml/ref/pg_recvlogical.sgml
***
*** 230,236  PostgreSQL documentation
  

 -d database
!--dbname database
 
  
   The database to connect to.  See the description of the actions for
--- 230,236 
  

 -d database
!--dbname=database
 
  
   The database to connect to.  See the description of the actions for
***
*** 243,249  PostgreSQL documentation
  

 -h hostname-or-ip
!--host hostname-or-ip
 
  
   Specifies the host name of the machine on which the server is
--- 243,249 
  

 -h hostname-or-ip
!--host=hostname-or-ip
 
  
   Specifies the host name of the machine on which the server is
***
*** 257,263  PostgreSQL documentation
  

 -p port
!--port port
 
  
   Specifies the TCP port or local Unix domain socket file
--- 257,263 
  

 -p port
!--port=port
 
  
   Specifies the TCP port or local Unix domain socket file
***
*** 270,276  PostgreSQL documentation
  

 -U user
!--username user
 
  
   Username to connect as.  Defaults to current operating system user
--- 270,276 
  

 -U user
!--username=user
 
  
   Username to connect as.  Defaults to current operating system user



Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 4:53 PM, Jim Nasby  wrote:
> The only case I can think of would be actually connecting to a remote
> database; in that case would we even want something as raw as this? I
> suspect not, in which case I don't see an issue. On the other hand, if we
> ever think we might want to do that, we should probably at least stick a
> version number field in there...
>
> But my suspicion is if we ever wanted to do something more with this then
> we'd want some kind of text-based format that could be passed into a SQL
> command (ie: SET ENVIRONMENT TO blah;)

I mean, I don't think this is much different than what we're already
doing to transfer variables from the postmaster to other backends in
EXEC_BACKEND builds; see write_nondefault_variables().  It doesn't
have a version number or anything either.  I don't see a reason why
this code needs to be held to a different standard; the purpose is
fundamentally quite similar.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 4:46 PM, Jim Nasby  wrote:
> On 10/24/14, 12:21 PM, Robert Haas wrote:
>> - What should we call dsm_unkeep_mapping, if not that?
>
> Only option I can think of beyond unkeep would be
> dsm_(un)register_keep_mapping. Dunno that it's worth it.

Hmm, we could rename dsm_keep_mapping() to dsm_unregister_mapping(),
since it's arranging to keep it by unregistering it from the resource
owner.  And then we could call the new function
dsm_register_mapping().  That has the appeal that "unregister" is a
word, whereas "unkeep" isn't, but it's a little confusing otherwise,
because the sense is reversed vs. the current naming.  Or we could
just leave dsm_keep_mapping() alone and call the new function
dsm_register_mapping().  A little non-orthogonal, but I think it'd be
OK.

>> - Does anyone have a tangible suggestion for how to reduce the code
>> duplication in patch #6?
>
> Between execute_sql_string() and tcop/exec_simple_query()? Is there stuff in
> exec_simple that's not safe for a bgworker? I'm not seeing why we can't use
> exec_simple. :/

There are some differences if you compare them closely.

> BTW, it does occur to me that we could do something to combine
> AllocSetContextCreate() followed by oldcontext=MemoryContextSwitchTo().

Meh.

> pg_background_result()
> +   dsm_unkeep_mapping(info->seg);
> +
> +   /* Set up tuple-descriptor based on column definition list.
> */
> +   if (get_call_result_type(fcinfo, NULL, &tupdesc) !=
> TYPEFUNC_COMPOSITE)
> +   ereport(ERROR,
> Is there a reason we can't check the result type before unkeeping? Seems
> silly to throw the results away just because someone flubbed a function
> call...

Hmm, yeah, I see no reason why we couldn't move that up higher in the
function.  It's probably a pretty common failure mode, too, so I can
see the convenience factor there.

> +   default:
> +   elog(WARNING, "unknown message type: %c (%zu
> bytes)",
> +msg.data[0], nbytes);
> It'd be useful to also provide DEBUG output with the message itself...

I think that's more work than is justified.  We'd have to hexdump it
or something because it might not be valid in the client encoding, and
it's a case that shouldn't happen anyway.
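
For what it's worth, such a dump would only take a small helper along these 
lines (a sketch with invented names, not code from the patch):

```c
#include <stdio.h>
#include <string.h>

/*
 * Render a byte buffer as a hex string ("01 41 42 ...") so it can be
 * logged safely regardless of client encoding.  Writes at most outlen
 * bytes including the terminating NUL; returns the number of input
 * bytes actually rendered.
 */
static size_t
hexdump_buffer(const char *data, size_t nbytes, char *out, size_t outlen)
{
    size_t i;
    size_t pos = 0;

    for (i = 0; i < nbytes; i++)
    {
        /* each byte needs "xx " plus room for the trailing NUL */
        if (pos + 4 > outlen)
            break;
        pos += (size_t) sprintf(out + pos, "%02x ", (unsigned char) data[i]);
    }
    if (pos > 0)
        out[pos - 1] = '\0';    /* drop trailing space */
    else if (outlen > 0)
        out[0] = '\0';
    return i;
}
```

Whether that's worth doing for a can't-happen case is, of course, the 
question above.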

(Hmm, that reminds me that I haven't thought of a solution to the
problem that the background worker might try to SET client_encoding,
which would be bad.  Not sure what the best fix for that is, off-hand.
I'd rather not prohibit ALL GUC changes in the background worker, as
there's no good reason for such a draconian restriction.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Jim Nasby

On 10/24/14, 3:26 PM, Robert Haas wrote:

On Fri, Oct 24, 2014 at 3:30 PM, Jim Nasby  wrote:

On 10/24/14, 2:23 PM, Jim Nasby wrote:

On the serialization structure itself, should we be worried about a
mismatch between available GUCs on the sender vs the receiver? Presumably if
the sender outputs a GUC that the receiver doesn't know about we'll get an
error, but what if the sender didn't include something? Maybe not an issue
today, but could this cause problems down the road if we wanted to use the
serialized data some other way? But maybe I'm just being paranoid. :)


I just realized there's a bigger problem there; this isn't portable against
any changes to any of the binary elements.

So I guess it's really a question of would we ever want this to function
(as-is) cross-version.


I think that would be pretty hard to make work, but I don't mind if
someone else wants to try for some use case that they want to meet.
My goal is to make parallel query work, so the data will just be
getting transferred between two simultaneously-running children of the
same postmaster.


The only case I can think of would be actually connecting to a remote database; 
in that case would we even want something as raw as this? I suspect not, in 
which case I don't see an issue. On the other hand, if we ever think we might 
want to do that, we should probably at least stick a version number field in 
there...
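
To illustrate the version-field idea (a toy format with invented names, not 
the one the patch actually uses): the first byte tags the blob, so a reader 
built for a different format can reject it cleanly instead of misparsing it.

```c
#include <string.h>

#define ENV_FORMAT_VERSION 1

/*
 * Toy serializer: version byte, then a NUL-terminated name, then a
 * NUL-terminated value.  Returns the number of bytes written, or 0 if
 * the output buffer is too small.  Illustration only.
 */
static size_t
env_serialize(const char *name, const char *value, char *out, size_t outlen)
{
    size_t need = 1 + strlen(name) + 1 + strlen(value) + 1;

    if (need > outlen)
        return 0;
    out[0] = ENV_FORMAT_VERSION;
    strcpy(out + 1, name);
    strcpy(out + 1 + strlen(name) + 1, value);
    return need;
}

/* Returns 0 on success, -1 on a truncated or unknown-version blob. */
static int
env_deserialize(const char *in, size_t inlen,
                const char **name, const char **value)
{
    if (inlen < 3 || in[0] != ENV_FORMAT_VERSION)
        return -1;
    *name = in + 1;
    *value = in + 1 + strlen(in + 1) + 1;
    return 0;
}
```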

But my suspicion is if we ever wanted to do something more with this then we'd 
want some kind of text-based format that could be passed into a SQL command 
(ie: SET ENVIRONMENT TO blah;)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Question about RI checks

2014-10-24 Thread Kevin Grittner
Florian Pflug  wrote:
> On Oct24, 2014, at 20:24 , Robert Haas  wrote:
>> On Fri, Oct 24, 2014 at 2:12 PM, Florian Pflug  wrote:
 What about doing one scan using SnapshotAny and then testing each
 returned row for visibility under both relevant snapshots?  See
 whether there is any tuple for which they disagree.
>>>
>>> See my other mail - testing whether the snapshots agree isn't enough,
>>> you'd have to check whether there could have been *any* snapshot taken
>>> between the two which would see a different result.
>>
>> Oh, hmm.  I had thought what I was proposing was strong enough to
>> handle that case, but now I see that it isn't.  However, I'm not
>> entirely sure that it's the RI code's job to prevent such cases, or at
>> least not when the transaction isolation level is less than
>> serializable.
>
> Well, the responsibility was laid onto the RI code by the decision to
> not do SIREAD locking for RI enforcement queries. Simply enabling SIREAD
> locking without doing anything else is not a good solution, I think -
> it will drive up the false-positive rate. And it would make workloads
> consisting purely of serializable transactions pay the price for the
> row-level locking done by the RI triggers, without getting anything in
> return.

Yeah, it would be far better to fix RI behavior if we can;
otherwise those using serializable transactions to protect data
integrity essentially pay the price for concurrency control around
foreign keys twice -- once using blocking locks and again using
non-blocking SIRead locks just to cover some odd corner cases.

I need to spend some more time looking at it, and I have another
couple things in front of this on my personal TODO list, but I
think that if we had a row lock which was stronger than current
SELECT FOR UPDATE behavior, and the delete of a parent row (or
update of its referenced key) read the children using it, this
would be solved.  Essentially I'm thinking of something just like
FOR UPDATE except that a transaction which attempted a concurrent
UPDATE or DELETE of the row (before the locking transaction ended)
would error out with some form of serialization failure.  I believe
this would be consistent with the behavior of SELECT FOR UPDATE in
Oracle, so it would allow a path for those migrating from Oracle to
maintain their current logic (with a slight change to the FOR
strength clause).
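
As a toy model of the proposed lock strength (names invented; the real 
behavior would live in the row-locking code): a concurrent writer against 
today's FOR UPDATE blocks, while one against the stronger lock fails 
immediately with a serialization error.

```c
/*
 * Toy model only: outcome of a concurrent UPDATE/DELETE depending on
 * which row lock the first transaction holds.  Names invented.
 */
typedef enum
{
    ROWLOCK_NONE,
    ROWLOCK_FOR_UPDATE,         /* today's SELECT FOR UPDATE */
    ROWLOCK_FOR_RI_CHECK        /* hypothetical stronger lock */
} RowLockMode;

typedef enum
{
    WRITE_OK,                   /* no conflicting lock held */
    WRITE_BLOCKS,               /* wait for the locker to finish */
    WRITE_SERIALIZATION_FAILURE /* error out immediately */
} WriteOutcome;

static WriteOutcome
attempt_concurrent_write(RowLockMode held)
{
    switch (held)
    {
        case ROWLOCK_FOR_UPDATE:
            return WRITE_BLOCKS;
        case ROWLOCK_FOR_RI_CHECK:
            return WRITE_SERIALIZATION_FAILURE;
        default:
            return WRITE_OK;
    }
}
```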

>> Is there an argument that the anomaly that results is
>> unacceptable at REPEATABLE READ?
>
> I'm not aware of any case that is clearly unacceptable for REPEATABLE
> READ, but neither am I convinced that no such case exists. REPEATABLE READ
> mode is hard because there doesn't seem to be any formal definition of
> what we do and do not guarantee. The guarantees provided by the SQL
> standard seem to be so much weaker than what we offer that they're not
> really helpful :-(
>
> I believe the best way forward is to first find a solution for SERIALIZABLE
> transactions, and then check if it can be applied to REPEATABLE READ
> mode too. For SERIALIZABLE mode, it's at least clear what we're aiming
> for -- offering true serializability.

What I outline above would clearly work for both.  Arguably we
could just use FOR UPDATE in REPEATABLE READ and only use the new,
stronger lock for SERIALIZABLE.  In fact, that seems to me to be
the appropriate course.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Jim Nasby

On 10/24/14, 12:21 PM, Robert Haas wrote:

- What should we call dsm_unkeep_mapping, if not that?


Only option I can think of beyond unkeep would be 
dsm_(un)register_keep_mapping. Dunno that it's worth it.


- Does anyone have a tangible suggestion for how to reduce the code
duplication in patch #6?

Between execute_sql_string() and tcop/exec_simple_query()? Is there stuff in 
exec_simple that's not safe for a bgworker? I'm not seeing why we can't use 
exec_simple. :/

BTW, it does occur to me that we could do something to combine 
AllocSetContextCreate() followed by oldcontext=MemoryContextSwitchTo().

pg_background_result()
+   dsm_unkeep_mapping(info->seg);
+
+   /* Set up tuple-descriptor based on column definition list. */
+   if (get_call_result_type(fcinfo, NULL, &tupdesc) != 
TYPEFUNC_COMPOSITE)
+   ereport(ERROR,
Is there a reason we can't check the result type before unkeeping? Seems silly 
to throw the results away just because someone flubbed a function call...

+   default:
+   elog(WARNING, "unknown message type: %c (%zu 
bytes)",
+msg.data[0], nbytes);
It'd be useful to also provide DEBUG output with the message itself...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Question about RI checks

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 2:58 PM, Florian Pflug  wrote:
> I believe the best way forward is to first find a solution for SERIALIZABLE
> transactions, and then check if it can be applied to REPEATABLE READ
> mode too. For SERIALIZABLE mode, it's at least clear what we're aiming
> for -- offering true serializability.

That sounds like a reasonable plan.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 3:30 PM, Jim Nasby  wrote:
> On 10/24/14, 2:23 PM, Jim Nasby wrote:
>> On the serialization structure itself, should we be worried about a
>> mismatch between available GUCs on the sender vs the receiver? Presumably if
>> the sender outputs a GUC that the receiver doesn't know about we'll get an
>> error, but what if the sender didn't include something? Maybe not an issue
>> today, but could this cause problems down the road if we wanted to use the
>> serialized data some other way? But maybe I'm just being paranoid. :)
>
> I just realized there's a bigger problem there; this isn't portable against
> any changes to any of the binary elements.
>
> So I guess it's really a question of would we ever want this to function
> (as-is) cross-version.

I think that would be pretty hard to make work, but I don't mind if
someone else wants to try for some use case that they want to meet.
My goal is to make parallel query work, so the data will just be
getting transferred between two simultaneously-running children of the
same postmaster.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 3:23 PM, Jim Nasby  wrote:
> Here's a review of patch 4.

Thanks!

> Perhaps it would be good to document the serialization format. I at least
> would have found it helpful, especially when reading
> estimate_variable_size(). I can take a stab at that if you'd rather not be
> bothered.

I find the existing code pretty clear: to my eyes,
serialize_variable() really couldn't be much more transparent.  But if
you want to suggest something I'm all ears.

> estimate_variable_size():
> +   valsize = 5;/* max(strlen('true'), strlen('false')) */
> ...
> +   if (abs(*conf->variable) < 1000)
> +   valsize = 3 + 1;
>
> If we're worried about size, perhaps we should use 1/0 for bool instead of
> true/false?

I guess we could.  I'm not particularly worried about size, myself.
We need all the rigamarole about estimating the amount of space needed
because we can't allocate more space once we begin serializing.  If
the DSM where this data is getting stored turns out not to be big
enough, we can't resize it.  So we need to *know* how big the data is
going to be, but I don't think we need to *care* very much.
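The estimate-then-serialize discipline described here can be sketched in standalone C (a simplified model, not the actual guc.c code, which estimates every GUC): compute an upper bound first, allocate once, and make sure the bound can never be an underestimate, since the "segment" cannot grow afterwards.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Upper bound on the serialized size of an int: sign + 10 digits + NUL. */
static size_t estimate_int_size(void)
{
    return 12;
}

/*
 * Serialize into a fixed-size buffer, as if it were a DSM segment that
 * cannot be resized once created.  Returns bytes written (excluding the
 * NUL), or -1 if the estimate was too small -- the fatal case the
 * estimation pass exists to rule out.
 */
static int serialize_int(char *buf, size_t bufsize, int value)
{
    int need = snprintf(NULL, 0, "%d", value);  /* exact length needed */

    if ((size_t) need + 1 > bufsize)
        return -1;              /* would overflow the "segment" */
    snprintf(buf, bufsize, "%d", value);
    return need;
}
```

This is why overestimating (e.g. 5 bytes for a bool) costs almost nothing, while underestimating is unrecoverable.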

> On the serialization structure itself, should we be worried about a mismatch
> between available GUCs on the sender vs the receiver? Presumably if the
> sender outputs a GUC that the receiver doesn't know about we'll get an
> error, but what if the sender didn't include something? Maybe not an issue
> today, but could this cause problems down the road if we wanted to use the
> serialized data some other way? But maybe I'm just being paranoid. :)

I think the principal danger here is that there could be some loadable
module which is present in one backend but not the other, or which (in
its inimitable wisdom) chose to register a different set of GUCs or
valid values for those GUCs in one backend than the other.  That would
indeed be sad.  Given that our rules for registering GUCs resemble
nothing so much as the wild west, I don't believe there's any ironclad
way to defend against it, either.

One thing we could do - not in this patch - is provide a mechanism
that would cause a background worker to load every loadable module
that's present in the launching worker.  I'm a little reluctant to
spend time on that now.  If pg_background doesn't work with certain
combinations of loadable modules and GUC settings... it's not a
catastrophe.  It might be more of an issue for parallel query proper,
because expectations may be higher there, but if so it won't hurt to
have some experience seeing whether things go wrong with this simple
system, and in what manner.  At least IMHO, it doesn't seem like a
good idea to put fiddling with the edges of this ahead of other
parallelism-related projects.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 3:01 PM, Peter Geoghegan  wrote:
> This is the situation with unique index inference all over again. You
> don't like a function-like expression. Okay. It would be a lot more
> helpful if instead of just criticizing, you *also* looked at my
> reasons for not wanting to go with something that would necessitate
> adding a dummy range table entry [1].

The problem here isn't that I haven't read your emails.  I have read
them all, including that one.  Moreover, this isn't the first time
you've asserted that someone hasn't read one of your emails.  So if
we're going to say what we each think would be helpful, then I think
it would be helpful if you stopped accusing the people who are taking
time to provide feedback on your work of having failed to read your
emails.  It's possible that there may be instances where that problem
exists, but it beggars credulity to suppose that the repeated
unanimous consensus against some of your design decisions is entirely
an artifact of failure to pay attention.

The fact is, I don't feel obliged to respond to every one of your
emails, just as you don't respond to every one of mine.  If you want
this patch to ever get committed, it's your job to push it forward -
not mine, not Simon's, and not Heikki's.  Sometimes that means you
have to solve hard problems instead of just articulating what they
are.

> There are some very good
> practical reasons for avoiding that. We only do that (the OLD.* and
> NEW.* stuff) in the context of CREATE RULE and one or two other
> things. We don't play any games like that during parse analysis of
> ordinary optimizable statements, which is what this is. And those
> dummy RTEs are always placeholders, AFAICT. Apart from those technical
> reasons, the two situations just aren't comparable from a user-visible
> perspective, but that isn't the main problem.
>
> I don't particularly expect you to come around to my view here. But it
> would be nice to have the problems with dummy RTEs and so on
> acknowledged, so it's understood that that is in itself a strange new
> direction for parse analysis of ordinary optimizable statements.

You're conflating the user-visible syntax with the parse tree
representation in way that is utterly without foundation.  I don't
have a position at this point on which parse-analysis representation
is preferable, but it's completely wrong-headed to say that you
"can't" make NEW.x produce the same parse-analysis output that your
CONFLICTING(x) syntax would have created.  Sure, it might be harder,
but it's not that much harder, and it's definitely not an unsolvable
problem.

I do acknowledge that there might be a better syntax out there than
NEW.x and OLD.x.  I have not seen one proposed that I like better.
Feel free to propose something.  But don't keep re-proposing something
that LOOKSLIKE(this) because nobody - other than perhaps you - likes
that.  And don't use the difficulty of parse analysis as a
justification for your proposed syntax, because, except in extreme
cases, there are going to be very few if any regular contributors to
this mailing list who will accept that as a justification for one
syntax over another.  Syntax needs to be justified by being beautiful,
elegant, precedent-driven, orthogonal, and minimalist - not by whether
you might need an extra 25-75 lines of parse analysis code to make it
work.

>>> Unique index inference (i.e. the way we figure out  *which* unique
>>> index to use) occurs during parse analysis. I think it would be
>>> inappropriate, and certainly inconvenient to do it during planning.
>>
>> You're wrong.  The choice of which index to use is clearly wildly
>> inappropriately placed in the parse analysis phase, and if you think
>> that has any chance of ever being committed by anyone, then you are
>> presuming the existence of a committer who won't mind ignoring the
>> almost-immediate revert request that would draw from, at the very
>> least, Tom.
>
> Why? This has nothing to do with optimization.

That is false.  If there is more than one index that could be used,
the system should select the best one.  That is an optimization
decision per se.  Also, if a plan is saved - in the plancache, say, or
in a view - the query can be re-planned if the index it depends on is
dropped, but there's no way to do parse analysis.

>> As much as I'd like to have this feature, your refusal to change
>> anything except when asked at least three times each by about five
>> different people makes this effort barely worth pursuing.  You can say
>> all you like that you're receptive to feedback, but multiple people
>> here are telling you that they feel otherwise.
>
> I'm tired of your assertions about why I haven't done certain things.
> It's not fair. I have incorporated a lot of feedback into V1.3 (and
> previous versions), which isn't acknowledged. Also, I moved to the USA
> this week, and things have been hectic. As I said in relation to
> unique index inference, please consider that

Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Jim Nasby

On 10/24/14, 2:23 PM, Jim Nasby wrote:

On the serialization structure itself, should we be worried about a mismatch 
between available GUCs on the sender vs the receiver? Presumably if the sender 
outputs a GUC that the receiver doesn't know about we'll get an error, but what 
if the sender didn't include something? Maybe not an issue today, but could 
this cause problems down the road if we wanted to use the serialized data some 
other way? But maybe I'm just being paranoid. :)


I just realized there's a bigger problem there; this isn't portable against any 
changes to any of the binary elements.

So I guess it's really a question of would we ever want this to function 
(as-is) cross-version.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Jim Nasby

On 10/24/14, 12:21 PM, Robert Haas wrote:

On Fri, Oct 24, 2014 at 1:13 PM, Jim Nasby  wrote:

It's a valid concern, but I think the way to handle it if needed is to limit
the number of connections a user can open. Or perhaps another option would
be to change the permissions on the related functions (do we check ACLs for
internal functions?)


I'm not sure dump-and-restore would preserve any properties of
anything in pg_catalog.

Anyway, I think we're getting a bit ahead of ourselves here.  The
questions I need answers to right now are:

- What should we call dsm_unkeep_mapping, if not that?
- Are there remaining complaints about patch #3?
- How can I get somebody to review patch #4?


Here's a review of patch 4.

Perhaps it would be good to document the serialization format. I at least would 
have found it helpful, especially when reading estimate_variable_size(). I can 
take a stab at that if you'd rather not be bothered.

estimate_variable_size():
+   valsize = 5;/* max(strlen('true'), strlen('false')) */
...
+   if (abs(*conf->variable) < 1000)
+   valsize = 3 + 1;

If we're worried about size, perhaps we should use 1/0 for bool instead of 
true/false?

On the serialization structure itself, should we be worried about a mismatch 
between available GUCs on the sender vs the receiver? Presumably if the sender 
outputs a GUC that the receiver doesn't know about we'll get an error, but what 
if the sender didn't include something? Maybe not an issue today, but could 
this cause problems down the road if we wanted to use the serialized data some 
other way? But maybe I'm just being paranoid. :)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-10-24 Thread Peter Geoghegan
On Fri, Oct 24, 2014 at 10:04 AM, Robert Haas  wrote:
> On Thu, Oct 23, 2014 at 9:43 PM, Peter Geoghegan  wrote:
>> * CONFLICTING() is renamed to EXCLUDED(). I know that some people
>> wanted me to go a different way with this. I think that there are very
>> good practical reasons not to [1], as well as good reasons related to
>> design, but I still accept that CONFLICTING() isn't very descriptive.
>> This spelling seems a lot better.
>
> Specifically, "some people" included at least three committers and at
> least one other community member no less prominent than yourself, or
> in other words, every single person who bothered to comment.  You can
> think whatever you like; the chances of it being committed that way
> are zero.

This is the situation with unique index inference all over again. You
don't like a function-like expression. Okay. It would be a lot more
helpful if instead of just criticizing, you *also* looked at my
reasons for not wanting to go with something that would necessitate
adding a dummy range table entry [1]. There are some very good
practical reasons for avoiding that. We only do that (the OLD.* and
NEW.* stuff) in the context of CREATE RULE and one or two other
things. We don't play any games like that during parse analysis of
ordinary optimizable statements, which is what this is. And those
dummy RTEs are always placeholders, AFAICT. Apart from those technical
reasons, the two situations just aren't comparable from a user-visible
perspective, but that isn't the main problem.

I don't particularly expect you to come around to my view here. But it
would be nice to have the problems with dummy RTEs and so on
acknowledged, so it's understood that that is in itself a strange new
direction for parse analysis of ordinary optimizable statements.

>> Unique index inference (i.e. the way we figure out  *which* unique
>> index to use) occurs during parse analysis. I think it would be
>> inappropriate, and certainly inconvenient to do it during planning.
>
> You're wrong.  The choice of which index to use is clearly wildly
> inappropriately placed in the parse analysis phase, and if you think
> that has any chance of ever being committed by anyone, then you are
> presuming the existence of a committer who won't mind ignoring the
> almost-immediate revert request that would draw from, at the very
> least, Tom.

Why? This has nothing to do with optimization. You either have an
appropriate unique index available, in which case you can proceed (and
the cost associated with value locking the index and so on is an
inherent cost, kind of like inserting into an index in general). Or
you don't, and you can't (you get an error, not a sequential scan).
That's what I'd call a semantic dependence on the index. Yes, that's
kind of unusual, but it's the reality.

I can have the code do unique index inference during planning instead
if you insist, but I think that that isn't useful. If you want me to
do things that way, please explain why. Otherwise, even if I simply
acquiesce to your wishes, I may end up missing the point.

> As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE
> syntax proposed upthread was the best of any mentioned thus far.  The
> MERGE-based syntaxes proposed upthread were crazily verbose for no
> discernable benefit.

I don't think anyone is too insistent on pursuing that. I think that
the INSERT .. ON CONFLICT UPDATE syntax has more or less emerged as
the favorite.

> As much as I'd like to have this feature, your refusal to change
> anything except when asked at least three times each by about five
> different people makes this effort barely worth pursuing.  You can say
> all you like that you're receptive to feedback, but multiple people
> here are telling you that they feel otherwise.

I'm tired of your assertions about why I haven't done certain things.
It's not fair. I have incorporated a lot of feedback into V1.3 (and
previous versions), which isn't acknowledged. Also, I moved to the USA
this week, and things have been hectic. As I said in relation to
unique index inference, please consider that I haven't immediately
complied with that feedback because there are genuine technical
obstacles that at the very least make it more difficult than it
appears. And, I'm not necessarily able to comply quickly because of
time constraints.

[1] 
http://www.postgresql.org/message-id/cam3swzqhixqi1ost14v7spjqrupmcnrtbxje846-eb1bc+9...@mail.gmail.com
-- 
Peter Geoghegan




Re: [HACKERS] Question about RI checks

2014-10-24 Thread Florian Pflug
On Oct24, 2014, at 20:24 , Robert Haas  wrote:
> On Fri, Oct 24, 2014 at 2:12 PM, Florian Pflug  wrote:
>>> What about doing one scan using SnapshotAny and then testing each
>>> returned row for visibility under both relevant snapshots?  See
>>> whether there is any tuple for which they disagree.
>> 
>> See my other mail - testing whether the snapshots agree isn't enough,
>> you'd have to check whether there could have been *any* snapshot taken
>> between the two which would see a different result.
> 
> Oh, hmm.  I had thought what I was proposing was strong enough to
> handle that case, but now I see that it isn't.  However, I'm not
> entirely sure that it's the RI code's job to prevent such cases, or at
> least not when the transaction isolation level is less than
> serializable.

Well, the responsibility was laid onto the RI code by the decision to
not do SIREAD locking for RI enforcement queries. Simply enabling SIREAD
locking without doing anything else is not a good solution, I think -
it will drive up the false-positive rate. And it would make workloads
consisting purely of serializable transactions pay the price for the
row-level locking done by the RI triggers, without getting anything in
return. 

> Is there an argument that the anomaly that results is
> unacceptable at REPEATABLE READ?

I'm not aware of any case that is clearly unacceptable for REPEATABLE
READ, but neither am I convinced that no such case exists. REPEATABLE READ
mode is hard because there doesn't seem to be any formal definition of
what we do and do not guarantee. The guarantees provided by the SQL
standard seem to be so much weaker than what we offer that they're not
really helpful :-(

I believe the best way forward is to first find a solution for SERIALIZABLE
transactions, and then check if it can be applied to REPEATABLE READ
mode too. For SERIALIZABLE mode, it's at least clear what we're aiming
for -- offering true serializability.

best regards,
Florian Pflug







Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Pavel Stehule
Hi Ali

I checked the code. I think the code organization is not good.
accumArrayResult is too long now. makeMdArrayResult will not work when
arrays were joined (it is not consistent now). I don't like the usage of
state->is_array_accum in array_userfunc.c -- it is a signal of wrong
wrapping.

Next question: there is a function array_append(anyarray, anyelement). Isn't
it time to define array_append(anyarray, anyarray) now?

Regards

Pavel

2014-10-24 15:05 GMT+02:00 Pavel Stehule :

>
>
> 2014-10-24 13:58 GMT+02:00 Pavel Stehule :
>
>>
>>
>> 2014-10-24 11:43 GMT+02:00 Ali Akbar :
>>
>>>
>>> 2014-10-24 16:26 GMT+07:00 Pavel Stehule :
>>>
 Hi

 Something in the last patch is wrong; I cannot compile it:

 arrayfuncs.c: In function ‘accumArrayResult’:
 arrayfuncs.c:4603:9: error: ‘ArrayBuildState’ has no member named ‘alen’
astate->alen = 64;  /* arbitrary starting array size */
  ^
 arrayfuncs.c:4604:9: error: ‘ArrayBuildState’ has no member named
 ‘dvalues’
astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
  ^
 arrayfuncs.c:4604:44: error: ‘ArrayBuildState’ has no member named
 ‘alen’
astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
 ^
 arrayfuncs.c:4605:9: error: ‘ArrayBuildState’ has no member named
 ‘dnulls’
astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
  ^
 arrayfuncs.c:4605:42: error: ‘ArrayBuildState’ has no member named
 ‘alen’
astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
   ^
 arrayfuncs.c:4606:9: error: ‘ArrayBuildState’ has no member named
 ‘nelems’
astate->nelems = 0;
  ^
 arrayfuncs.c:4618:13: error: ‘ArrayBuildState’ has no member named
 ‘nelems’
if (astate->nelems >= astate->alen)
  ^
 arrayfuncs.c:4618:31: error: ‘ArrayBuildState’ has no member named
 ‘alen’
if (astate->nelems >= astate->alen)
^
 arrayfuncs.c:4620:10: error: ‘ArrayBuildState’ has no member named
 ‘alen’
 astate->alen *= 2;

>>>
>>> Sorry,  correct patch attached.
>>>
>>> This patch is in patience format (git --patience ..). In previous
>>> patches, I used context format (git --patience ... | filterdiff
>>> --format=context), but it turns out that some modifications were lost.
>>>
>>
>> The last version is compilable, but something is still broken
>>
>> postgres=# select array_agg(array[i, i+1, i-1])
>> from generate_series(1,2) a(i);
>> ERROR:  could not find array type for data type integer[]
>>
>
> I am sorry, it works - I had a problem with a broken database
>
> I fixed a small issue in the regress tests and enhanced the tests for
> varlena types and null values.
>
> Regards
>
> Pavel
>
>
>>
>> but array(subselect) works
>>
>> postgres=# select array(select a from xx);
>>array
>> ---
>>  {{1,2,3},{1,2,3}}
>> (1 row)
>>
>> Regards
>>
>> Pavel
>>
>>
>>
>>>
>>> --
>>> Ali Akbar
>>>
>>
>>
>


Re: [HACKERS] Question about RI checks

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 2:12 PM, Florian Pflug  wrote:
>> What about doing one scan using SnapshotAny and then testing each
>> returned row for visibility under both relevant snapshots?  See
>> whether there is any tuple for which they disagree.
>
> See my other mail - testing whether the snapshots agree isn't enough,
> you'd have to check whether there could have been *any* snapshot taken
> between the two which would see a different result.

Oh, hmm.  I had thought what I was proposing was strong enough to
handle that case, but now I see that it isn't.  However, I'm not
entirely sure that it's the RI code's job to prevent such cases, or at
least not when the transaction isolation level is less than
serializable.  Is there an argument that the anomaly that results is
unacceptable at REPEATABLE READ?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] How ugly would this be? (ALTER DATABASE)

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 2:06 PM, Joshua D. Drake  wrote:
> One of the things we run into quite a bit is customers who are using
> multiple databases when they should be using multiple schemas. I am sure
> other consultants run into this as well. This gets even more difficult as
> uptime requirements have become all but 100%. So my question is, what would
> this take?
>
> ALTER DATABASE foo LOCATION DATABASE bar SCHEMA baz?
>
> Where if we execute that command, database foo would move to schema baz
> within database bar?
>
> I am fully aware of what it takes on the client side but structurally within
> postgres what would it take? Is it even reasonable?

What if the database contains more than one schema?

You could perhaps try to create a command that would move a schema
between two databases in the same cluster.  It's fraught with
practical difficulties because a single backend can't be connected to
both databases at the same time, so how exactly do you make the
required catalog changes all in a single transaction?  But if you
imagine that you have an infinite pool of top-notch PostgreSQL talent
with unbounded time to work on this problem and no other, I bet
somebody could engineer a solution.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Question about RI checks

2014-10-24 Thread Florian Pflug
On Oct24, 2014, at 19:32 , Robert Haas  wrote:
> On Fri, Oct 24, 2014 at 1:28 PM, Florian Pflug  wrote:
> The only other option I see would be to teach the executor to check
>> whether *any* snapshot between the transaction's snapshot and a current
>> snapshot would see a different set of rows. Simply checking whether both
>> the current snapshot and the transaction's snapshot see the same set isn't
>> sufficient, per the counter-example in my other mail :-(
> 
> What about doing one scan using SnapshotAny and then testing each
> returned row for visibility under both relevant snapshots?  See
> whether there is any tuple for which they disagree.

See my other mail - testing whether the snapshots agree isn't enough,
you'd have to check whether there could have been *any* snapshot taken
between the two which would see a different result. Or at least what I
currently think - I don't yet have an actual proof at hand that doing
this always preserves serializability. But it seems plausible that it
does, because it ensures that there is a rather large time frame during
which the enforcement query can be placed in a serial schedule without
changing its result.

This applies only to SERIALIZABLE transactions, though. For REPEATABLE
READ, it might be that checking whether the two snapshots agree is
sufficient.

So a third way forward would be to not skip the SSI logic for RI enforcement
queries. Though I fear that, due to the imprecise nature of SIREAD locks,
doing so would drive the false positive rate way up for workloads involving 
lots of parent and child row modifications.

best regards,
Florian Pflug





[HACKERS] How ugly would this be? (ALTER DATABASE)

2014-10-24 Thread Joshua D. Drake


Hello,

One of the things we run into quite a bit is customers who are using 
multiple databases when they should be using multiple schemas. I am sure 
other consultants run into this as well. This gets even more difficult 
as uptime requirements have become all but 100%. So my question is, what 
would this take?


ALTER DATABASE foo LOCATION DATABASE bar SCHEMA baz?

Where if we execute that command, database foo would move to schema baz 
within database bar?


I am fully aware of what it takes on the client side but structurally 
within postgres what would it take? Is it even reasonable?


JD
--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
 not be surprised when they come back as Romans."




Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 9:05 AM, Heikki Linnakangas
 wrote:
> So, this is what I came up with for master. Does anyone see a problem with
> it?

In the proposed commit message, you mis-spelled "significant" as "signicant".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Materialized views don't show up in information_schema

2014-10-24 Thread Robert Haas
On Fri, Oct 17, 2014 at 8:10 PM, Stephen Frost  wrote:
> * Peter Eisentraut (pete...@gmx.net) wrote:
>> On 10/16/14 9:45 AM, Stephen Frost wrote:
>> > Alright, coming back to this, I have to ask- how are matviews different
>> > from views from the SQL standard's perspective?  I tried looking through
>> > the standard to figure it out (and I admit that I probably missed
>> > something), but the only thing appears to be a statement in the standard
>> > that (paraphrased) "functions are run when the view is queried" and that
>> > strikes me as a relatively minor point..
>>
>> To me, the main criterion is that you cannot DROP VIEW a materialized view.
>
> That is an entirely correctable situation.  We don't require 'DROP
> UNLOGGED TABLE'.

I think that's an inapposite comparison.  The fact that a table is
unlogged is merely a property of the table; it does not change the
fact that it is a table.  A materialized view, on the other hand, is a
different kind of object from a view.  This manifests itself in the fact
that it's represented by a different relkind; and that different
syntax is used not only for DROP but also for COMMENT, ALTER VIEW,
SECURITY LABEL, and ALTER EXTENSION .. ADD/DROP; and that the set of
supported operations on a materialized view is different from a
regular view (and will probably be more different in the future).  If
we really want to change this, we can't just change DROP VIEW; we need
to change all of the places in a consistent fashion, and we probably
have to continue to support the old syntax so that we don't break
existing dumps.

But I think it's the wrong thing anyway, because it presumes that,
when Kevin chose to make materialized views a different relkind and a
different object type, rather than just a property of an object, he
made the wrong call, and I don't agree with that.  I think he got it
exactly right.  A materialized view is really much more like a table
than a view: it has storage and can be vacuumed, clustered, analyzed,
and so on.  That's far more significant IMV than the difference
between a table and unlogged table.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Typo fixes for pg_recvlogical documentation

2014-10-24 Thread Robert Haas
On Thu, Oct 23, 2014 at 10:00 AM, Michael Paquier
 wrote:
> pg_recvlogical is missing some "=" signs for a couple of option names where
> double-dash is used, like this one:
> --username user
> should be that:
> --username=user
>
> Attached is a patch correcting that.

For reasons that are not clear to me, I cannot apply this patch.

[rhaas pgsql]$ patch -p1 < ~/Downloads/20141023_pg_recvlogical_fixes.patch
patching file doc/src/sgml/ref/pg_recvlogical.sgml
Hunk #1 succeeded at 270 with fuzz 1 (offset 165 lines).
Hunk #2 FAILED at 282.
Hunk #3 FAILED at 295.
Hunk #4 FAILED at 308.
3 out of 4 hunks FAILED -- saving rejects to file
doc/src/sgml/ref/pg_recvlogical.sgml.rej

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Question about RI checks

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 1:28 PM, Florian Pflug  wrote:
> The only other option I see would be to teach the executor to check
> whether *any* snapshot between the transaction's snapshot and a current
> snapshot would see a different set of rows. Simply checking whether both
> the current snapshot and the transaction's snapshot see the same set isn't
> sufficient, per the counter-example in my other mail :-(

What about doing one scan using SnapshotAny and then testing each
returned row for visibility under both relevant snapshots?  See
whether there is any tuple for which they disagree.
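As a toy illustration of that idea (plain Python, not PostgreSQL's actual snapshot machinery -- the tuple layout and visibility rule here are deliberately simplified assumptions), "scan everything as with SnapshotAny, then flag tuples on which the two snapshots disagree" might look like:

```python
from collections import namedtuple

HeapTuple = namedtuple("HeapTuple", ["xmin", "xmax"])  # inserter / deleter xids

def visible(tup, snap_xid, committed):
    """Crude MVCC rule: the inserting xid committed before the snapshot
    was taken, and no deleting xid did."""
    inserted = tup.xmin in committed and tup.xmin < snap_xid
    deleted = (tup.xmax is not None
               and tup.xmax in committed and tup.xmax < snap_xid)
    return inserted and not deleted

def snapshots_disagree(heap, xact_snap, curr_snap, committed):
    """The SnapshotAny part: look at every tuple, live or dead, and
    report those whose visibility differs between the two snapshots."""
    return [t for t in heap
            if visible(t, xact_snap, committed) != visible(t, curr_snap, committed)]

heap = [
    HeapTuple(xmin=5,  xmax=None),  # old row: visible to both snapshots
    HeapTuple(xmin=12, xmax=None),  # inserted between the two snapshots
    HeapTuple(xmin=5,  xmax=13),    # deleted between the two snapshots
]
committed = {5, 12, 13}
# Tuples on which the transaction snapshot (xid 10) and the current
# snapshot (xid 20) disagree -- exactly what the RI check must detect.
print(snapshots_disagree(heap, 10, 20, committed))
```

Any non-empty result would mean some concurrent transaction changed the set of matching rows between the two snapshots.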

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 1:18 PM, Josh Berkus  wrote:
> On 10/24/2014 10:04 AM, Robert Haas wrote:
>> As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE
>> syntax proposed upthread was the best of any mentioned thus far.  The
>> MERGE-based syntaxes proposed upthread were crazily verbose for no
>> discernible benefit.
>
> For those of us who haven't followed every post in this thread, is there
> somewhere I can see the proposed syntax?

There are a couple of different proposals but this should give you a
feeling for where we're at:

http://www.postgresql.org/message-id/CA+TgmoZN=2ajki1n4jz5bkmyi8r_cpudw+dtoppmtelvmso...@mail.gmail.com

The part I like is the idea of making UPSERT look like an INSERT
statement with an additional clause that says how a unique violation
should be handled.  The part nobody except Peter likes is using
functional notation like CONFLICTING() or EXCLUDED() to pull in values
from the tuple causing the unique violation.  And there are some other
areas of disagreement about particular keywords and so on.  But I
personally like that general style much more than the alternatives
derived from MERGE.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 1:13 PM, Jim Nasby  wrote:
> It's a valid concern, but I think the way to handle it if needed is to limit
> the number of connections a user can open. Or perhaps another option would
> be to change the permissions on the related functions (do we check ACLs for
> internal functions?)

I'm not sure dump-and-restore would preserve any properties of
anything in pg_catalog.

Anyway, I think we're getting a bit ahead of ourselves here.  The
questions I need answers to right now are:

- What should we call dsm_unkeep_mapping, if not that?
- Are there remaining complaints about patch #3?
- How can I get somebody to review patch #4?
- Does anyone have a tangible suggestion for how to reduce the code
duplication in patch #6?

The question of where pg_background should ultimately live does
matter, but the answer will be "the -hackers mailing list archives"
unless we can get agreement on the prerequisite patches.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Question about RI checks

2014-10-24 Thread Florian Pflug
On Oct24, 2014, at 18:42 , Robert Haas  wrote:
> I don't think you can count on being able to figure out all of the
> recent lockers by looking at xmax; it can get overwritten.  For
> example, suppose transaction A locks the row and then commits.  Then
> transaction B comes along and again locks the row and commits.  My
> understanding is that B won't create a multi-xact since there's just
> one locker; it'll just stomp on the previous xmax.
> 
> Now, you could change that, but I suspect it would have pretty drastic
> implications on systems that have both foreign keys and long-running
> transactions.

Yes, we'd have to create multi-xids in some cases where we don't do that
today. How big the effect would be, I honestly don't know. I guess it
depends on how costly multi-xid creation is, compared to updating the
parent row (to change its xmax) and updating or deleting child rows.
My hope would be that it's cheap compared to that, but maybe that's not
true, especially since multi-xids are xlog'ed nowadays.

The only other option I see would be to teach the executor to check
whether *any* snapshot between the transaction's snapshot and a current
snapshot would see a different set of rows. Simply checking whether both
the current snapshot and the transaction's snapshot see the same set isn't
sufficient, per the counter-example in my other mail :-(

An advantage of the latter is that it'd put the burden on the session
that attempts to remove a parent row, instead of on the sessions which
add or remove children.

best regards,
Florian Pflug







Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-10-24 Thread Josh Berkus
On 10/24/2014 10:04 AM, Robert Haas wrote:
> As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE
> syntax proposed upthread was the best of any mentioned thus far.  The
> MERGE-based syntaxes proposed upthread were crazily verbose for no
> discernible benefit.

For those of us who haven't followed every post in this thread, is there
somewhere I can see the proposed syntax?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-24 Thread Jim Nasby

On 10/22/14, 7:49 PM, Joshua D. Drake wrote:


I lean toward a push into core because:


+1


3. I am not sure I buy into the super-user argument. Just because the 
functionality is there, doesn't mean it has to be used.


It's a valid concern, but I think the way to handle it if needed is to limit 
the number of connections a user can open. Or perhaps another option would be 
to change the permissions on the related functions (do we check ACLs for 
internal functions?)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-10-24 Thread Robert Haas
On Thu, Oct 23, 2014 at 9:43 PM, Peter Geoghegan  wrote:
> * CONFLICTING() is renamed to EXCLUDED(). I know that some people
> wanted me to go a different way with this. I think that there are very
> good practical reasons not to [1], as well as good reasons related to
> design, but I still accept that CONFLICTING() isn't very descriptive.
> This spelling seems a lot better.

Specifically, "some people" included at least three committers and at
least one other community member no less prominent than yourself, or
in other words, every single person who bothered to comment.  You can
think whatever you like; the chances of it being committed that way
are zero.

> Unique index inference (i.e. the way we figure out  *which* unique
> index to use) occurs during parse analysis. I think it would be
> inappropriate, and certainly inconvenient to do it during planning.

You're wrong.  The choice of which index to use is clearly wildly
inappropriately placed in the parse analysis phase, and if you think
that has any chance of ever being committed by anyone, then you are
presuming the existence of a committer who won't mind ignoring the
almost-immediate revert request that would draw from, at the very
least, Tom.

As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE
syntax proposed upthread was the best of any mentioned thus far.  The
MERGE-based syntaxes proposed upthread were crazily verbose for no
discernible benefit.

As much as I'd like to have this feature, your refusal to change
anything except when asked at least three times each by about five
different people makes this effort barely worth pursuing.  You can say
all you like that you're receptive to feedback, but multiple people
here are telling you that they feel otherwise.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Question about RI checks

2014-10-24 Thread Robert Haas
On Thu, Oct 23, 2014 at 2:41 PM, Florian Pflug  wrote:
> The only reason we need the crosscheck snapshot to do that is because
> children may have been added (and the change committed) *after* the
> transaction which removed the parent has taken its snapshot, but *before*
> that transaction locks the parent row.
>
> My proposal is to instead extend the locking protocol to prevent that.
> Essentially, we have to raise a serialization error whenever
>
>   1) We attempt to exclusively lock a row (this includes updating or deleting
>  it), and
>
>   2) somebody else did hold a share lock on that row recently, and
>
>   3) That somebody is invisible to us according to our snapshot.
>
> My initial attempt to do that failed, because we used to have very little
> means of storing the locking history - the only source of information was
> the xmax field, so any update of a tuple removed information about previous
> lock holders - even if that update was later aborted. I pondered using
> multi-xids for this, but at the time I deemed that too risky - back at the
> time, they had a few wraparound issues and the like which were OK for share
> locks, but not for what I intended.
>
> But now that we have KEY SHARE locks, the situation changes. We now rely on
> multi-xids to a much greater extent, and AFAIK multi-xid wraparound is now
> correctly dealt with. We also already ensure that the information contained
> in multi-xids are preserved across tuple upgrades (otherwise, updating a row
> on which someone holds a KEY SHARE lock would be broken).
>
> So all that is missing, I think, is
>
>   1) To make sure we only remove a multi-xid if none of the xids are invisible
>  to any snapshot (i.e. lie before GlobalXmin or something like that).
>
>   2) When we acquire a lock (either explicitly or implicitly by doing an
>  UPDATE or DELETE) check if all previous committed lock holders are
>  visible according to our snapshot, and raise a serialization error
>  if not.

I don't think you can count on being able to figure out all of the
recent lockers by looking at xmax; it can get overwritten.  For
example, suppose transaction A locks the row and then commits.  Then
transaction B comes along and again locks the row and commits.  My
understanding is that B won't create a multi-xact since there's just
one locker; it'll just stomp on the previous xmax.

Now, you could change that, but I suspect it would have pretty drastic
implications on systems that have both foreign keys and long-running
transactions.
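The xmax-stomping behavior described above can be mimicked in miniature (Python, purely illustrative -- real heap tuples, infomask bits, and MultiXacts are far more involved): when the previous locker is no longer in progress, a new single locker simply overwrites xmax, so the lock history that the proposed check would need is gone.

```python
class Row:
    """Toy heap row: xmax records only the most recent locker(s)."""
    def __init__(self):
        self.xmax = None

    def share_lock(self, xid, in_progress):
        if self.xmax is None or self.xmax not in in_progress:
            # Previous locker already committed or aborted: stomp on xmax.
            self.xmax = xid
        else:
            # Only with a *concurrent* locker is a multixact-like set created.
            self.xmax = frozenset({self.xmax, xid})

row = Row()
row.share_lock(100, in_progress=set())  # transaction A locks, then commits
row.share_lock(200, in_progress=set())  # B locks after A committed
print(row.xmax)  # 200 -- no trace of A's lock remains
```

Forcing the multixact branch even for sequential lockers would preserve the history, which is exactly the change whose cost is being questioned here.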

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] make pg_controldata accept "-D dirname"

2014-10-24 Thread Heikki Linnakangas

On 10/24/2014 11:36 AM, Michael Paquier wrote:

On Thu, Sep 25, 2014 at 12:38 PM, Heikki Linnakangas <
hlinnakan...@vmware.com> wrote:


On 09/24/2014 05:48 PM, Abhijit Menon-Sen wrote:


Updated patches attached.



Thanks, applied some version of these.


This set of patches has been applied as b0d81ad but patch 0001 did not make
it in, so pg_controldata is not yet able to accept -D:
$ pg_controldata -D /foo/path
pg_controldata: could not open file "-D/global/pg_control" for reading: No
such file or directory
$ pg_controldata /foo/path
pg_controldata: could not open file "/foo/path/global/pg_control" for
reading: No such file or directory


Argh, looks like I forgot the actual code changes required.

I just noticed that pg_controldata and pg_resetxlog don't check for 
extra arguments:


$ pg_resetxlog data fds sdf sdf
Transaction log reset

Fixed that too.

- Heikki





Re: [HACKERS] Reducing lock strength of adding foreign keys

2014-10-24 Thread Robert Haas
On Wed, Oct 22, 2014 at 3:06 AM, Andreas Karlsson  wrote:
> I have been thinking about why we need to grab an AccessExclusiveLock on the
> table with the PK when we add a foreign key. Adding new tables with foreign
> keys to old ones is common so it would be really nice if the lock strength
> could be reduced.
>
> A comment in the code claims that we need to lock so no rows are deleted
> under us and that adding a trigger will lock in AccessExclusive anyway. But
> with MVCC catalog access and the removal of SnapshotNow I do not see why
> adding a trigger would require an exclusive lock. Just locking for data
> changes should be enough.

The use of MVCC catalog access doesn't necessarily mean that adding a
trigger doesn't require an AccessExclusive lock.  Those changes - if I
dare to say so myself - solved a complex and important problem, but
only one of many problems in this area, and people seem prone to
thinking that they solved more problems than they in fact did.

I think instead of focusing on foreign keys, we should rewind a bit
and think about the locking level required to add a trigger.  If we
figure something out there, then we can consider how it affects
foreign keys.  I went looking for previous discussions of remaining
hazards and found these postings:

http://www.postgresql.org/message-id/ca+tgmoy4glsxzk0tao29-ljtcuj0sl1xwcwq51xb-hfysgi...@mail.gmail.com
http://www.postgresql.org/message-id/20893.1393892...@sss.pgh.pa.us
http://www.postgresql.org/message-id/20140306224340.ga3551...@tornado.leadboat.com

As far as triggers are concerned, the issue of skew between the
transaction snapshot and what the ruleutils.c snapshots do seems to be
the principal issue.  Commit e5550d5fec66aa74caad1f79b79826ec64898688
changed pg_get_constraintdef() to use an MVCC snapshot rather than a
current MVCC snapshot; if that change is safe, I am not aware of any
reason why we couldn't change pg_get_triggerdef() similarly.  Barring
further hazards I haven't thought of, I would expect that we could add
a trigger to a relation with only ShareRowExclusiveLock.  Anything
less than ShareRowExclusiveLock would open up strange timing races
around the firing of triggers by transactions writing the table: they
might or might not notice that a trigger had been added before
end-of-transaction, depending on the timing of cache flushes, which
certainly seems no good.  But even ShareRowExclusiveLock seems like a large
improvement over AccessExclusiveLock.

When a constraint trigger - which is used to implement a foreign key -
is added, there are actually TWO tables involved: the table upon which
the trigger will actually fire, and some other table which is
mentioned in passing in the trigger definition.  It's possible that
the locking requirements for the secondary table are weaker since I
don't think the presence of the trigger actually affects runtime
behavior there.  However, there's probably little point in trying to
weaken the lock to less than the level required for the main table
because a foreign key involves adding referential integrity triggers
to both tables.

So I tentatively propose (and with due regard for the possibility
others may see dangers that I've missed) that a reasonable goal would
be to lower the lock strength required for both CREATE TRIGGER and ADD
FOREIGN KEY from AccessExclusiveLock to ShareRowExclusiveLock,
allowing concurrent SELECT and SELECT FOR SHARE against the tables,
but not any actual write operations.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] btree_gin and ranges

2014-10-24 Thread Teodor Sigaev

which would also be nice to fix..



Of course, I agree. With rbtree usage instead of tidbitmap hash and semantic 
knowledge about operators in GIN...



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] Deferring some AtStart* allocations?

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 10:10 AM, Andres Freund  wrote:
> What I was thinking was that you'd append the messages to the layer one
> level deeper than the parent. Then we'd missed the invalidations when
> rolling back the intermediate xact. But since I was quite mistaken
> above, this isn't a problem :)

So, you happy with the patch now?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] uninitialized values in revised prepared xact code

2014-10-24 Thread Heikki Linnakangas

On 10/24/2014 05:40 PM, Andres Freund wrote:

On 2014-10-24 17:13:49 +0300, Heikki Linnakangas wrote:

Yeah. The padding bytes in TwoPhaseFileHeader were not initialized.

That's simple enough to fix, but when I run valgrind, I get a whole
bunch of similar messages. A few are from pgstat: the padding bytes in the
pgstat messages are not initialized. One comes from
write_relcache_init_file(); again I believe it's padding bytes being
uninitialized (in FormData_pg_attribute). And one from the XLogInsert from
heap_insert; there's an uninitialized padding byte in xl_heap_insert. And so
forth.. Is it worthwhile to hunt down all of these? If there aren't many
more than these, it probably is worth it, but I fear this might be an
endless effort. Have we been clean of these warnings at any point in the
past?


Did you use the valgrind suppression file in src/tools? It suppresses
some "known harmless" cases.


Ah, I did not. With the file, the original warning that started this 
thread is gone; you added a suppression for it in commit 
9a0a12f683235d3e10b873baba974f6414297a7e.


- Heikki




Re: [HACKERS] uninitialized values in revised prepared xact code

2014-10-24 Thread Andres Freund
On 2014-10-24 17:13:49 +0300, Heikki Linnakangas wrote:
> Yeah. The padding bytes in TwoPhaseFileHeader were not initialized.
> 
> That's simple enough to fix, but when I run valgrind, I get a whole
> bunch of similar messages. A few are from pgstat: the padding bytes in the
> pgstat messages are not initialized. One comes from
> write_relcache_init_file(); again I believe it's padding bytes being
> uninitialized (in FormData_pg_attribute). And one from the XLogInsert from
> heap_insert; there's an uninitialized padding byte in xl_heap_insert. And so
> forth.. Is it worthwhile to hunt down all of these? If there aren't many
> more than these, it probably is worth it, but I fear this might be an
> endless effort. Have we been clean of these warnings at any point in the
> past?

Did you use the valgrind suppression file in src/tools? It suppresses
some "known harmless" cases.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Getting rid of "accept incoming network connections" prompts on OS X

2014-10-24 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Oct 24, 2014 at 3:39 PM, Tom Lane  wrote:
>> Peter, Dave: maybe you have tweaked things to keep listen_addresses
>> empty and rely only on Unix-socket connections?

> Should be so. The target of this feature is development on OSX, right?
> And most of the time development would be done only on the local
> machine, machine being most of the time a laptop. So instead of adding
> an optional step in configure to enforce the creation of a
> certificate, why not simply encourage people to use listen_addresses =
> '' on OSX by documenting it? Even when working on replication or
> related things on a local machine, it is possible to simply pass the
> socket directory...

Some clients (eg JDBC) don't support Unix-socket connections AFAIK, so
this seems like a rather restricted solution.

regards, tom lane




Re: [HACKERS] pg_receivexlog --status-interval add fsync feedback

2014-10-24 Thread Heikki Linnakangas

On 10/24/2014 01:24 PM, furu...@pm.nttdata.co.jp wrote:

Sorry, I'm going around in circles. But I'd like to say again, I
don't think this is good idea. It prevents asynchronous
pg_receivexlog from fsyncing WAL data and sending feedbacks more
frequently at all. They are useful, for example, when we want to
monitor the write location of asynchronous pg_receivexlog in almost
realtime. But if we adopt the idea, since feedback cannot be sent
soon in async mode, pg_stat_replication always returns the

not-up-to-date location.


Why not send a message every 10 seconds when its not sync rep?


Or even after every write(). It's a tiny amount of network traffic anyway.


I understand that sending feedback messages frequently will keep 
pg_stat_replication up to date.

Are there really no users who want to fsync even in async mode?
I think people who dislike data loss would like that idea.


The OS will not sit on the written but not fsync'd data forever, it will 
get flushed to disk in a few seconds even without the fsync. It's just 
that there is no guarantee on when it will hit the disk, but there are 
no strong guarantees in asynchronous replication anyway.



Whether in sync or async mode, is returning feedback and executing
fsync() the same way walreceiver does really such a problem?


Probably would be OK. It would increase the I/O a lot, thanks to a lot 
of small writes and fsyncs, which might be surprising for a tool like 
pg_receivexlog.


One idea is to change the default behavior to be like walreceiver, and 
fsync() after every write. But provide an option to get the old 
behavior, "--no-fsync".
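A minimal sketch of that idea (Python, purely hypothetical -- pg_receivexlog is written in C, and the "--no-fsync" option named here did not exist at the time of writing): a receive loop that flushes after every write by default and advertises only durably-flushed bytes in its feedback, while the no-fsync mode leaves flushing to the OS.

```python
import os

def receive_wal(chunks, path, fsync_each_write=True):
    """Append incoming WAL chunks to 'path'. With fsync_each_write
    (the proposed walreceiver-like default), every write is made
    durable immediately; the --no-fsync behavior skips the per-write
    flush, so the reported flush position lags behind the write position."""
    flushed = 0
    with open(path, "ab") as f:
        for chunk in chunks:
            f.write(chunk)
            if fsync_each_write:
                f.flush()
                os.fsync(f.fileno())
                flushed = f.tell()  # safe to advertise in a feedback message
    return flushed
```

The trade-off discussed above shows up directly: with per-write fsync the flush position tracks the write position at the cost of many small I/Os; without it, the function can only honestly report zero flushed bytes.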


- Heikki





Re: [HACKERS] uninitialized values in revised prepared xact code

2014-10-24 Thread Heikki Linnakangas

On 10/11/2014 10:20 PM, Bruce Momjian wrote:


Uh, was this fixed?  I see a cleanup commit for this C file, but this
report is from June:

commit 07a4a93a0e35a778c77ffbbbc18de29e859e18f0
Author: Heikki Linnakangas 
Date:   Fri May 16 09:47:50 2014 +0300

Initialize tsId and dbId fields in WAL record of COMMIT PREPARED.

Commit dd428c79 added dbId and tsId to the xl_xact_commit struct 
but missed
that prepared transaction commits reuse that struct. Fix that.

Because those fields were left uninitialized, replaying a commit 
prepared WAL
record in a hot standby node would fail to remove the relcache init 
file.
That can lead to "could not open file" errors on the standby. 
Relcache init
file only needs to be removed when a system table/index is 
rewritten in the
transaction using two phase commit, so that should be rare in 
practice. In
HEAD, the incorrect dbId/tsId values are also used for filtering in 
logical
replication code, causing the transaction to always be filtered out.

Analysis and fix by Andres Freund. Backpatch to 9.0 where hot 
standby was
introduced.


No, that was a different issue.

(more below)


On Mon, Jun 30, 2014 at 11:58:59AM +0200, Andres Freund wrote:

Hi,

I've just rerun valgrind for the first time in a while and saw the
following splat. My guess is it exists since bb38fb0d43c, but that's
blindly guessing:

==2049== Use of uninitialised value of size 8
==2049==at 0x4FE66D: EndPrepare (twophase.c:1063)
==2049==by 0x4F231B: PrepareTransaction (xact.c:2217)
==2049==by 0x4F2A38: CommitTransactionCommand (xact.c:2676)
==2049==by 0x79013E: finish_xact_command (postgres.c:2408)
==2049==by 0x78DE97: exec_simple_query (postgres.c:1062)
==2049==by 0x791FDD: PostgresMain (postgres.c:4010)
==2049==by 0x71B13B: BackendRun (postmaster.c:4113)
==2049==by 0x71A86D: BackendStartup (postmaster.c:3787)
==2049==by 0x71714C: ServerLoop (postmaster.c:1566)
==2049==by 0x716804: PostmasterMain (postmaster.c:1219)
==2049==by 0x679405: main (main.c:219)
==2049==  Uninitialised value was created by a stack allocation
==2049==at 0x4FE16C: StartPrepare (twophase.c:942)
==2049==
==2049== Syscall param write(buf) points to uninitialised byte(s)
==2049==at 0x5C69640: __write_nocancel (syscall-template.S:81)
==2049==by 0x4FE6AE: EndPrepare (twophase.c:1064)
==2049==by 0x4F231B: PrepareTransaction (xact.c:2217)
==2049==by 0x4F2A38: CommitTransactionCommand (xact.c:2676)
==2049==by 0x79013E: finish_xact_command (postgres.c:2408)
==2049==by 0x78DE97: exec_simple_query (postgres.c:1062)
==2049==by 0x791FDD: PostgresMain (postgres.c:4010)
==2049==by 0x71B13B: BackendRun (postmaster.c:4113)
==2049==by 0x71A86D: BackendStartup (postmaster.c:3787)
==2049==by 0x71714C: ServerLoop (postmaster.c:1566)
==2049==by 0x716804: PostmasterMain (postmaster.c:1219)
==2049==by 0x679405: main (main.c:219)
==2049==  Address 0x64694ed is 1,389 bytes inside a block of size 8,192 alloc'd
==2049==at 0x4C27B8F: malloc (vg_replace_malloc.c:298)
==2049==by 0x8E766E: AllocSetAlloc (aset.c:853)
==2049==by 0x8E8E04: MemoryContextAllocZero (mcxt.c:627)
==2049==by 0x8A54D3: AtStart_Inval (inval.c:704)
==2049==by 0x4F1DFC: StartTransaction (xact.c:1841)
==2049==by 0x4F28D1: StartTransactionCommand (xact.c:2529)
==2049==by 0x7900A7: start_xact_command (postgres.c:2383)
==2049==by 0x78DAF4: exec_simple_query (postgres.c:860)
==2049==by 0x791FDD: PostgresMain (postgres.c:4010)
==2049==by 0x71B13B: BackendRun (postmaster.c:4113)
==2049==by 0x71A86D: BackendStartup (postmaster.c:3787)
==2049==by 0x71714C: ServerLoop (postmaster.c:1566)
==2049==  Uninitialised value was created by a stack allocation
==2049==at 0x4FE16C: StartPrepare (twophase.c:942)

It's probably just padding - twophase.c:1063 is the CRC32 computation of
the record data.


Yeah. The padding bytes in TwoPhaseFileHeader were not initialized.

That's simple enough to fix, but when I run valgrind, I get a whole 
bunch of similar messages. A few are from pgstat: the padding bytes in 
the pgstat messages are not initialized. One comes from 
write_relcache_init_file(); again I believe it's padding bytes being 
uninitialized (in FormData_pg_attribute). And one from the XLogInsert 
from heap_insert; there's an uninitialized padding byte in 
xl_heap_insert. And so forth.. Is it worthwhile to hunt down all of 
these? If there aren't many more than these, it probably is worth it, 
but I fear this might be an endless effort. Have we been clean of these 
warnings at any point in the past?


- Heikki




Re: [HACKERS] Deferring some AtStart* allocations?

2014-10-24 Thread Andres Freund
On 2014-10-24 09:45:59 -0400, Robert Haas wrote:
> On Fri, Oct 24, 2014 at 9:17 AM, Andres Freund  wrote:
> >> If
> >> that subtransaction abouts, AtEOSubXact_Inval() gets called again,
> >> sees that it has messages (that it inherited from the innermost
> >> subtransaction), and takes the exact same code-path that it would have
> >> pre-patch.
> >
> > Consider what happens if the innermost transaction commits and the
> > second innermost one rolls back. AtEOSubXact_Inval() won't do anything
> > because it doesn't have any entries itself.
> 
> This is the part I don't understand.  After the innermost transaction
> commits, it *does* have entries itself.

Sure, otherwise there'd be no problem.

> This whole block is basically just an optimization:

> +   if (myInfo->parent == NULL || myInfo->parent->my_level
> < my_level - 1)
> +   {
> +   myInfo->my_level--;
> +   return;
> +   }
> 
> If we removed that code, then we'd just do this instead:
> 
> /* Pass up my inval messages to parent */
> AppendInvalidationMessages(&myInfo->parent->PriorCmdInvalidMsgs,
>&myInfo->PriorCmdInvalidMsgs);
> 
> /* Pending relcache inval becomes parent's problem too */
> if (myInfo->RelcacheInitFileInval)
> myInfo->parent->RelcacheInitFileInval = true;

Ick. I somehow misimagined that you'd just append them one layer further
up.  I obviously shouldn't review code during a conference.

> > Even though there's still
> > valid cache inval entries containing the innermost xact's version of the
> > catalog, now not valid anymore.
> 
> This part doesn't make sense to me either.  The invalidation entries
> don't put data into the caches; they take data out.  When we change
> stuff, we generate invalidation messages.

What I was thinking was that you'd append the messages to the layer one
level deeper than the parent. Then we'd missed the invalidations when
rolling back the intermediate xact. But since I was quite mistaken
above, this isn't a problem :)

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Getting rid of "accept incoming network connections" prompts on OS X

2014-10-24 Thread Michael Paquier
On Fri, Oct 24, 2014 at 3:39 PM, Tom Lane  wrote:
> Peter, Dave: maybe you have tweaked things to keep listen_addresses
> empty and rely only on Unix-socket connections?
Should be so. The target of this feature is development on OSX, right?
And most of the time development would be done only on the local
machine, machine being most of the time a laptop. So instead of adding
an optional step in configure to enforce the creation of a
certificate, why not simply encourage people to use listen_addresses =
'' on OSX by documenting it? Even when working on replication or
related things on a local machine, it is possible to simply pass the
socket directory...
-- 
Michael




Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 1:42 AM, Craig Ringer  wrote:
> On 10/23/2014 09:21 PM, Robert Haas wrote:
>> Agreed - I think if you want an error check here it should use elog()
>> or ereport(), not Assert().
>
> That's what I originally did, but it's too early for elog.
>
> I'm reluctant to just fprintf(...) to stderr, as there's no way for the
> user to suppress that, and it'll be emitted for each backend start.
> Though on the other hand it really is a "shouldn't happen" case.
>
> So the options seem to be ignoring the error silently or printing to stderr.

Either of those is OK with me.  I think it's a bad idea to use
Assert() to check the results of system calls, because an assertion
failure is supposed to indicate a bug in our code, not Microsoft's
code.  But printing to stderr is an acceptable way of indicating an
error that happens very early, and ignoring doesn't look unreasonable
in this context either.  Yet another option is to do nothing about the
error at the time that it's reported but store the error code
somewhere and use it to generate an error message once the system is
initialized.  I'm tentatively inclined to believe that's more
machinery than this particular case justifies, but will happily defer
to any emerging consensus.
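The "remember now, report later" idea Robert mentions can be sketched in a few lines of plain C. Everything below is invented for illustration (save_early_error() and friends are hypothetical names, not PostgreSQL APIs): the early code path merely records the first failure, and a later, fully initialized phase turns it into a report.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names; these are not actual PostgreSQL APIs. */
static int  saved_early_errno = 0;
static char saved_early_context[64];

/* Called very early, before proper error reporting is usable: just record. */
static void
save_early_error(int err, const char *context)
{
    /* Keep only the first failure; later ones are usually cascades. */
    if (saved_early_errno == 0)
    {
        saved_early_errno = err;
        snprintf(saved_early_context, sizeof(saved_early_context),
                 "%s", context);
    }
}

/* Called once initialization is complete and reporting is safe. */
static void
report_saved_early_error(void)
{
    if (saved_early_errno != 0)
        fprintf(stderr, "earlier failure during %s: error code %d\n",
                saved_early_context, saved_early_errno);
}
```

In a real backend the reporting step would go through the normal logging machinery once it is up; stderr stands in for that here.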

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: Getting rid of "accept incoming network connections" prompts on OS X

2014-10-24 Thread Robert Haas
On Wed, Oct 22, 2014 at 2:14 AM, edward745  wrote:
> One of the queries in ri_triggers.c has me a little baffled.
>
> For (relatively) obvious reasons, a FK insert triggers a SELECT 1 FROM
> pk_rel ... FOR KEY SHARE.
> For not-so-obvious reasons, a PK delete triggers a SELECT 1 FROM fk_rel ...
> FOR KEY SHARE.
>
> I can't see what the lock on fk_rel achieves. Both operations are already
> contending for the lock on the PK row, which seems like enough to cover
> every eventuality.
>
> And even if the lock serves a purpose, KEY SHARE is an odd choice, since the
> referencing field is, in general, not a "key" in this sense.

Please don't post unrelated questions onto existing mailing list
threads.  Start a new thread for a new topic.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Deferring some AtStart* allocations?

2014-10-24 Thread Robert Haas
On Fri, Oct 24, 2014 at 9:17 AM, Andres Freund  wrote:
>> If
>> that subtransaction aborts, AtEOSubXact_Inval() gets called again,
>> sees that it has messages (that it inherited from the innermost
>> subtransaction), and takes the exact same code-path that it would have
>> pre-patch.
>
> Consider what happens if the innermost transaction commits and the
> second innermost one rolls back. AtEOSubXact_Inval() won't do anything
> because it doesn't have any entries itself.

This is the part I don't understand.  After the innermost transaction
commits, it *does* have entries itself.  This whole block is basically
just an optimization:

+   if (myInfo->parent == NULL || myInfo->parent->my_level
< my_level - 1)
+   {
+   myInfo->my_level--;
+   return;
+   }

If we removed that code, then we'd just do this instead:

/* Pass up my inval messages to parent */
AppendInvalidationMessages(&myInfo->parent->PriorCmdInvalidMsgs,
   &myInfo->PriorCmdInvalidMsgs);

/* Pending relcache inval becomes parent's problem too */
if (myInfo->RelcacheInitFileInval)
myInfo->parent->RelcacheInitFileInval = true;

That should be basically equivalent.  I mean, we need a bit of surgery
to ensure that the parent entry actually exists before we attach stuff
to it, but that's mechanical.  The whole point here though is that,
pre-patch, the parent has an empty entry and we have one with stuff in
it.  What I think happens in the current code is that we take all of
the stuff in our non-empty entry and move it into the parent's empty
entry, so that the parent ends up with an entry identical to ours but
with a level one less than we had.  We could do that here too,
manufacturing the empty entry and then moving our stuff into it and
then deleting our original entry, but that seems silly; just change
the level and call it good.

IOW, I don't see how I'm *changing* the logic here at all.  Why
doesn't whatever problem you're concerned about exist in the current
coding, too?
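The equivalence described above can be modeled in a few lines of self-contained C. The structure and function names below are invented and only loosely mirror inval.c: when the parent level has no entry, relabeling our entry with the parent's level is observably the same as creating an empty parent entry and moving the messages into it.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of a lazily-created invalidation stack; names are invented. */
typedef struct InvalEntry
{
    int                 my_level;   /* subtransaction nesting level */
    int                 nmsgs;      /* pending invalidation messages */
    struct InvalEntry  *parent;     /* next-outer entry, possibly NULL */
} InvalEntry;

static InvalEntry *top = NULL;

static void
push_entry(int level, int nmsgs)
{
    InvalEntry *e = malloc(sizeof(InvalEntry));
    e->my_level = level;
    e->nmsgs = nmsgs;
    e->parent = top;
    top = e;
}

/* Subtransaction commit: pass messages up to the parent level. */
static void
at_subcommit(void)
{
    InvalEntry *e = top;

    /*
     * Entries are created lazily, so the immediate parent level may have
     * no entry.  Rather than manufacturing one and moving the messages,
     * just relabel our entry with the parent's level.
     */
    if (e->parent == NULL || e->parent->my_level < e->my_level - 1)
    {
        e->my_level--;
        return;
    }

    /* Parent entry exists: append our messages and discard our entry. */
    e->parent->nmsgs += e->nmsgs;
    top = e->parent;
    free(e);
}
```

Either path leaves the same message count owned by the same parent level; the relabeling variant just skips an allocation and a copy.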

> Even though there are still
> cache entries containing the innermost xact's version of the
> catalog, which is now not valid anymore.

This part doesn't make sense to me either.  The invalidation entries
don't put data into the caches; they take data out.  When we change
stuff, we generate invalidation messages.  At end of statement, once
we've done CommandCounterIncrement(), we execute those invalidation
messages locally, so that the next cache reload sees the updated state.  At
end of transaction, once we've committed, we send those invalidation
messages to the shared queue to be executed by everyone else, so that
they also see the updated state.  If we abort a transaction or
sub-transaction, we re-execute those same invalidation messages so
that we flush the new state, forcing the next set of cache reloads to
again pull in the old state.  The patch isn't changing - or not
intentionally anyway - anything about any of that logic.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Getting rid of "accept incoming network connections" prompts on OS X

2014-10-24 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Oct 24, 2014 at 8:26 AM, Dave Page  wrote:
>> On Fri, Oct 24, 2014 at 7:18 AM, Peter Eisentraut  wrote:
>>> On 10/21/14 1:16 PM, Tom Lane wrote:
 If you do any Postgres development on OS X, you've probably gotten
 seriously annoyed by the way that, every single time you reinstall the
 postmaster executable, you get a dialog box asking whether you'd like
 to allow it to accept incoming network connections.

>>> I used to, but somehow I don't see this anymore.  Just to be sure, I
>>> made sure the firewall is on, checked that postgres is not in the
>>> exception list, rebooted, built postgresql from scratch, ran make check,
>>> but no pop-up.
>>> 
>>> I'm on Yosemite.  Maybe this was changed.

>> I've never seen it on any version of OS X (I've worked my way from
>> Panther to Yosemite). There must be more to it...

I see it every darn time I've changed the postmaster executable.
Maybe there is a difference in security settings?  I have the firewall
enabled and in Settings->Security->General, "Allow apps downloaded from:
Mac App Store and identified developers", which I think is the default.
[ experiments... ]  Hm, setting that to "Anywhere" doesn't change the
results anyway.

> FWIW, with the firewall on, I am used to seeing this annoying popup window
> when starting an instance manually; make check never complains though.

Ah.  pg_regress sets listen_addresses to empty so that no TCP ports
are opened, hence no firewall complaints from "make check".  However,
as soon as you start a normal installation, you get the complaint,
as even an open port on 127.0.0.1 is enough to provoke it.

Peter, Dave: maybe you have tweaked things to keep listen_addresses
empty and rely only on Unix-socket connections?

regards, tom lane




Re: [HACKERS] Trailing comma support in SELECT statements

2014-10-24 Thread Alex Goncharov
On Tue, Oct 21, 2014 at 10:16 AM, Tom Lane  wrote:

> (Of course, I'm not for the feature w.r.t. SQL either.  But breaking data
> compatibility is just adding an entire new dimension of trouble.
>

Another dimension of the trouble is breaking the operation of the
tools that parse SQL statements for various purposes, e.g. for
dependency analysis.

This is a misfeature for the benefit of edit-lazy users only.

-- Alex


Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)

2014-10-24 Thread Michael Paquier
On Fri, Oct 24, 2014 at 3:05 PM, Heikki Linnakangas
 wrote:
> On 10/23/2014 11:09 AM, Heikki Linnakangas wrote:
>>
>> At least for master, we should consider changing the way the archiving
>> works so that we only archive WAL that was generated in the same server.
>> I.e. we should never try to archive WAL files belonging to another
>> timeline.
>>
>> I just remembered that we discussed a different problem related to this
>> some time ago, at
>>
>> http://www.postgresql.org/message-id/20131212.110002.204892575.horiguchi.kyot...@lab.ntt.co.jp.
>> The conclusion of that was that at promotion, we should not archive the
>> last, partial, segment from the old timeline.
>
>
> So, this is what I came up with for master. Does anyone see a problem with
> it?
Thinking long-term, this is a solid approach, so +1 for it. I just
tested the patch and the extra segment files do not show up anymore.
Patch looks good as well.
-- 
Michael




Re: [HACKERS] btree_gin and ranges

2014-10-24 Thread Heikki Linnakangas

On 10/22/2014 03:01 PM, Teodor Sigaev wrote:
> Anyway GIN couldn't be used for ORDER BY clause.

which would also be nice to fix...

- Heikki





Re: [HACKERS] Deferring some AtStart* allocations?

2014-10-24 Thread Andres Freund
On 2014-10-23 12:04:40 -0400, Robert Haas wrote:
> On Tue, Oct 21, 2014 at 12:00 PM, Andres Freund  
> wrote:
> > On 2014-10-09 15:01:19 -0400, Robert Haas wrote:
> >>  /*
> >> @@ -960,18 +966,38 @@ AtEOXact_Inval(bool isCommit)
> > ...
> >> + /*
> >> +  * We create invalidation stack entries lazily, so the 
> >> parent might
> >> +  * not have one.  Instead of creating one, moving all the 
> >> data over,
> >> +  * and then freeing our own, we can just adjust the level of 
> >> our own
> >> +  * entry.
> >> +  */
> >> + if (myInfo->parent == NULL || myInfo->parent->my_level < 
> >> my_level - 1)
> >> + {
> >> + myInfo->my_level--;
> >> + return;
> >> + }
> >> +
> >
> > I think this bit might not be correct. What if the subxact one level up
> > aborts? Then it'll miss dealing with these invalidation entries. Or am I
> > misunderstanding something?
> 
> One of us is.  I think you're asking about a situation where we have a
> transaction, and a subtransaction, and within that another
> subtransaction.  Only the innermost subtransaction has invalidation
> messages.  At the innermost level, we commit; the above code makes
> those messages the responsibility of the outer subtransaction.

Exactly.

> If
> that subtransaction aborts, AtEOSubXact_Inval() gets called again,
> sees that it has messages (that it inherited from the innermost
> subtransaction), and takes the exact same code-path that it would have
> pre-patch.

Consider what happens if the innermost transaction commits and the
second innermost one rolls back. AtEOSubXact_Inval() won't do anything
because it doesn't have any entries itself. Even though there are still
cache entries containing the innermost xact's version of the
catalog, which is now not valid anymore.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] ARMv5

2014-10-24 Thread Merlin Moncure
On Wed, Oct 22, 2014 at 6:31 AM, Александр Глухов
 wrote:
> Hello, I have a problem with PostgreSQL. I need to install PostgreSQL on
> ARMv5 with tcp/ip access, but I have no experience in it. I am trying to do
> it in LTIB and now I need to create a postgresql.spec file. Could you help
> me with it?

This is offtopic for -hackers.  That being said, have you considered
compiling postgres directly on the device?

merlin






Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Pavel Stehule
2014-10-24 13:58 GMT+02:00 Pavel Stehule :

>
>
> 2014-10-24 11:43 GMT+02:00 Ali Akbar :
>
>>
>> 2014-10-24 16:26 GMT+07:00 Pavel Stehule :
>>
>>> Hi
>>>
>>> something in the last patch is wrong; I cannot compile it:
>>>
>>> arrayfuncs.c: In function ‘accumArrayResult’:
>>> arrayfuncs.c:4603:9: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>astate->alen = 64;  /* arbitrary starting array size */
>>>  ^
>>> arrayfuncs.c:4604:9: error: ‘ArrayBuildState’ has no member named
>>> ‘dvalues’
>>>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
>>>  ^
>>> arrayfuncs.c:4604:44: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
>>> ^
>>> arrayfuncs.c:4605:9: error: ‘ArrayBuildState’ has no member named
>>> ‘dnulls’
>>>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
>>>  ^
>>> arrayfuncs.c:4605:42: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
>>>   ^
>>> arrayfuncs.c:4606:9: error: ‘ArrayBuildState’ has no member named
>>> ‘nelems’
>>>astate->nelems = 0;
>>>  ^
>>> arrayfuncs.c:4618:13: error: ‘ArrayBuildState’ has no member named
>>> ‘nelems’
>>>if (astate->nelems >= astate->alen)
>>>  ^
>>> arrayfuncs.c:4618:31: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>if (astate->nelems >= astate->alen)
>>>^
>>> arrayfuncs.c:4620:10: error: ‘ArrayBuildState’ has no member named ‘alen’
>>> astate->alen *= 2;
>>>
>>
>> Sorry,  correct patch attached.
>>
>> This patch is in patience format (git --patience ..). In previous
>> patches, I used context format (git --patience ... | filterdiff
>> --format=context), but it turns out that some modifications were lost.
>>
>
> the last version is compilable, but something is still broken
>
> postgres=# select array_agg(array[i, i+1, i-1])
> from generate_series(1,2) a(i);
> ERROR:  could not find array type for data type integer[]
>

I am sorry, it works - I had a problem with a broken database

I fixed small issue in regress tests and I enhanced tests for varlena types
and null values.

Regards

Pavel


>
> but array(subselect) works
>
> postgres=# select array(select a from xx);
>array
> ---
>  {{1,2,3},{1,2,3}}
> (1 row)
>
> Regards
>
> Pavel
>
>
>
>>
>> --
>> Ali Akbar
>>
>
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 7e5bcd9..f59738a
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** NULL baz(3 rows)
*** 12046,12051 
--- 12046,12067 
   

 
+ array_agg
+
+array_agg(anyarray)
+   
+   
+any
+   
+   
+the same array type as input type
+   
+ input arrays, aggregated into a higher-order multidimensional array. Rejects NULL and empty arrays as input.
+  
+ 
+  
+   
+
  average
 
 
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
new file mode 100644
index 2f0680f..8c182a4
*** a/doc/src/sgml/syntax.sgml
--- b/doc/src/sgml/syntax.sgml
*** SELECT ARRAY(SELECT oid FROM pg_proc WHE
*** 2238,2243 
--- 2238,2248 
   array
  ---
   {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413}
+ 
+ SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i));
+  array
+ ---
+  {{1},{2},{3},{4},{5}}
  (1 row)
  
 The subquery must return a single column. The resulting
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
new file mode 100644
index 41e973b..0261fcb
*** a/src/backend/nodes/nodeFuncs.c
--- b/src/backend/nodes/nodeFuncs.c
*** exprType(const Node *expr)
*** 108,119 
  	type = exprType((Node *) tent->expr);
  	if (sublink->subLinkType == ARRAY_SUBLINK)
  	{
! 		type = get_array_type(type);
! 		if (!OidIsValid(type))
! 			ereport(ERROR,
! 	(errcode(ERRCODE_UNDEFINED_OBJECT),
! 	 errmsg("could not find array type for data type %s",
! 			format_type_be(exprType((Node *) tent->expr);
  	}
  }
  else if (sublink->subLinkType == MULTIEXPR_SUBLINK)
--- 108,123 
  	type = exprType((Node *) tent->expr);
  	if (sublink->subLinkType == ARRAY_SUBLINK)
  	{
! 		if (!OidIsValid(get_element_type(type)))
! 		{
! 			/* not array, so check for its array type */
! 			type = get_array_type(type);
! 			if (!OidIsValid(type))
! ereport(ERROR,
! 		(errcode(ERRCODE_UNDEFINED_OBJECT),
! 		 errmsg("could not find array type for data type %s",
! format_type_be(exprType((Node 

Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)

2014-10-24 Thread Heikki Linnakangas

On 10/23/2014 11:09 AM, Heikki Linnakangas wrote:
> At least for master, we should consider changing the way the archiving
> works so that we only archive WAL that was generated in the same server.
> I.e. we should never try to archive WAL files belonging to another timeline.
>
> I just remembered that we discussed a different problem related to this
> some time ago, at
> http://www.postgresql.org/message-id/20131212.110002.204892575.horiguchi.kyot...@lab.ntt.co.jp.
> The conclusion of that was that at promotion, we should not archive the
> last, partial, segment from the old timeline.

So, this is what I came up with for master. Does anyone see a problem
with it?

- Heikki

>From 9115276335d37d7d2dab55ff8b0642041c16bfc3 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Fri, 24 Oct 2014 15:39:02 +0300
Subject: [PATCH] Don't try to archive WAL from different timelines.

Previously, we would always create a .done file for files restored from the
archive, or streamed from a master server. That served two purposes: it
allowed the files to be recycled at restartpoints, and ensured that we don't
try to re-archive the files after promotion. However, there were some gaps
in that approach: if streaming replication was stopped at a segment boundary,
no .done file was created. The standby server might also crash just before
it has created the .done file.

This is a more wholesale solution to the problem. WAL files belonging to
a different timeline are never archived, regardless of .done or .ready
files. The rule is that the server that generates a WAL file is solely
responsible for archiving it.

This also changes the long-standing behavior at end of recovery, where we
archived the last, partial, WAL segment from the old timeline. We no longer
do that; if you copy any files manually to pg_xlog, and want them to be
archived, you should copy them manually to the archive, too. (The first
segment on the new timeline, which is copied from the last partial one, will
be archived as usual.)

This is the same idea as in commit c9cc7e05c6d82a9781883a016c70d95aa4923122,
which was reverted in favor of the .done file approach in commit
1bd42cd70abdbc946ad64c3c8eaefed4bb8b1145. Turns out the .done file approach
was not such a good idea after all. The behavior at end-of-recovery was
discussed on pgsql-hackers back in February 2014
(20131212.110002.204892575.horiguchi.kyot...@lab.ntt.co.jp), but the plan
was never followed through (my bad, I forgot all about it).

This should fix the issue reported by Jehan-Guillaume de Rorthais, where
a standby server sometimes creates .ready files for WAL files streamed from
the master. But this is too risky to backpatch - the changes in behavior
are quite significant - so we'll have to fix it with some other approach in
back-branches.

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e77af22..2370467 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -208,7 +208,8 @@ static int	LocalXLogInsertAllowed = -1;
  * When ArchiveRecoveryRequested is set, archive recovery was requested,
  * ie. recovery.conf file was present. When InArchiveRecovery is set, we are
  * currently recovering using offline XLOG archives. These variables are only
- * valid in the startup process.
+ * valid in the startup process, but there is a copy of InArchiveRecovery
+ * in shared memory.
  *
  * When ArchiveRecoveryRequested is true, but InArchiveRecovery is false, we're
  * currently performing crash recovery using only XLOG files in pg_xlog, but
@@ -546,9 +547,11 @@ typedef struct XLogCtlData
 
 	/*
 	 * SharedRecoveryInProgress indicates if we're still in crash or archive
-	 * recovery.  Protected by info_lck.
+	 * recovery.  SharedInArchiveRecovery indicates if we're currently in
+	 * archive recovery.  Protected by info_lck.
 	 */
 	bool		SharedRecoveryInProgress;
+	bool		SharedInArchiveRecovery;
 
 	/*
 	 * SharedHotStandbyActive indicates if we're still in crash or archive
@@ -748,6 +751,7 @@ static MemoryContext walDebugCxt = NULL;
 #endif
 
 static void readRecoveryCommandFile(void);
+static void enterArchiveRecovery(void);
 static void exitArchiveRecovery(TimeLineID endTLI, XLogSegNo endLogSegNo);
 static bool recoveryStopsBefore(XLogRecord *record);
 static bool recoveryStopsAfter(XLogRecord *record);
@@ -4168,9 +4172,7 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
 			{
 ereport(DEBUG1,
 		(errmsg_internal("reached end of WAL in pg_xlog, entering archive recovery")));
-InArchiveRecovery = true;
-if (StandbyModeRequested)
-	StandbyMode = true;
+enterArchiveRecovery();
 
 /* initialize minRecoveryPoint to this record */
 LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
@@ -4866,6 +4868,7 @@ XLOGShmemInit(void)
 	 */
 	XLogCtl->XLogCacheBlck = XLOGbuffers - 1;
 	XLogCtl->SharedRecoveryInProgress = true;
+	XLogCtl->SharedInArchiveRecovery = false;
 	XLogCtl->SharedHotStandbyActive = false;
 	

Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Alvaro Herrera
Michael Paquier wrote:

> That's not surprising, sometimes filterdiff misses the shot.

Really?  Wow, that's bad news.  I've been using it to submit patches
from time to time ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Pavel Stehule
Hi

I did some performance tests and it is interesting:

it is about 15% faster than original implementation.

Regards

Pavel




2014-10-24 13:58 GMT+02:00 Pavel Stehule :

>
>
> 2014-10-24 11:43 GMT+02:00 Ali Akbar :
>
>>
>> 2014-10-24 16:26 GMT+07:00 Pavel Stehule :
>>
>>> Hi
>>>
>>> something in the last patch is wrong; I cannot compile it:
>>>
>>> arrayfuncs.c: In function ‘accumArrayResult’:
>>> arrayfuncs.c:4603:9: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>astate->alen = 64;  /* arbitrary starting array size */
>>>  ^
>>> arrayfuncs.c:4604:9: error: ‘ArrayBuildState’ has no member named
>>> ‘dvalues’
>>>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
>>>  ^
>>> arrayfuncs.c:4604:44: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
>>> ^
>>> arrayfuncs.c:4605:9: error: ‘ArrayBuildState’ has no member named
>>> ‘dnulls’
>>>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
>>>  ^
>>> arrayfuncs.c:4605:42: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
>>>   ^
>>> arrayfuncs.c:4606:9: error: ‘ArrayBuildState’ has no member named
>>> ‘nelems’
>>>astate->nelems = 0;
>>>  ^
>>> arrayfuncs.c:4618:13: error: ‘ArrayBuildState’ has no member named
>>> ‘nelems’
>>>if (astate->nelems >= astate->alen)
>>>  ^
>>> arrayfuncs.c:4618:31: error: ‘ArrayBuildState’ has no member named ‘alen’
>>>if (astate->nelems >= astate->alen)
>>>^
>>> arrayfuncs.c:4620:10: error: ‘ArrayBuildState’ has no member named ‘alen’
>>> astate->alen *= 2;
>>>
>>
>> Sorry,  correct patch attached.
>>
>> This patch is in patience format (git --patience ..). In previous
>> patches, I used context format (git --patience ... | filterdiff
>> --format=context), but it turns out that some modifications were lost.
>>
>
> the last version is compilable, but something is still broken
>
> postgres=# select array_agg(array[i, i+1, i-1])
> from generate_series(1,2) a(i);
> ERROR:  could not find array type for data type integer[]
>
> but array(subselect) works
>
> postgres=# select array(select a from xx);
>array
> ---
>  {{1,2,3},{1,2,3}}
> (1 row)
>
> Regards
>
> Pavel
>
>
>
>>
>> --
>> Ali Akbar
>>
>
>


Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Pavel Stehule
2014-10-24 11:43 GMT+02:00 Ali Akbar :

>
> 2014-10-24 16:26 GMT+07:00 Pavel Stehule :
>
>> Hi
>>
>> something in the last patch is wrong; I cannot compile it:
>>
>> arrayfuncs.c: In function ‘accumArrayResult’:
>> arrayfuncs.c:4603:9: error: ‘ArrayBuildState’ has no member named ‘alen’
>>astate->alen = 64;  /* arbitrary starting array size */
>>  ^
>> arrayfuncs.c:4604:9: error: ‘ArrayBuildState’ has no member named
>> ‘dvalues’
>>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
>>  ^
>> arrayfuncs.c:4604:44: error: ‘ArrayBuildState’ has no member named ‘alen’
>>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
>> ^
>> arrayfuncs.c:4605:9: error: ‘ArrayBuildState’ has no member named ‘dnulls’
>>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
>>  ^
>> arrayfuncs.c:4605:42: error: ‘ArrayBuildState’ has no member named ‘alen’
>>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
>>   ^
>> arrayfuncs.c:4606:9: error: ‘ArrayBuildState’ has no member named ‘nelems’
>>astate->nelems = 0;
>>  ^
>> arrayfuncs.c:4618:13: error: ‘ArrayBuildState’ has no member named
>> ‘nelems’
>>if (astate->nelems >= astate->alen)
>>  ^
>> arrayfuncs.c:4618:31: error: ‘ArrayBuildState’ has no member named ‘alen’
>>if (astate->nelems >= astate->alen)
>>^
>> arrayfuncs.c:4620:10: error: ‘ArrayBuildState’ has no member named ‘alen’
>> astate->alen *= 2;
>>
>
> Sorry,  correct patch attached.
>
> This patch is in patience format (git --patience ..). In previous patches,
> I used context format (git --patience ... | filterdiff --format=context),
> but it turns out that some modifications were lost.
>

the last version is compilable, but something is still broken

postgres=# select array_agg(array[i, i+1, i-1])
from generate_series(1,2) a(i);
ERROR:  could not find array type for data type integer[]

but array(subselect) works

postgres=# select array(select a from xx);
   array
---
 {{1,2,3},{1,2,3}}
(1 row)

Regards

Pavel



>
> --
> Ali Akbar
>


Re: [HACKERS] Question about RI checks

2014-10-24 Thread Florian Pflug
On Oct23, 2014, at 17:45 , Kevin Grittner  wrote:
> Every way I look at it, inside a REPEATABLE READ or SERIALIZABLE
> transaction a check for child rows when validating a parent DELETE
> should consider both rows which exist according to the transaction
> snapshot and according to a "current" snapshot.

I've pondered this further, and unfortunately it seems that this
isn't sufficient to guarantee true serializability :-(

Verifying that both snapshots contain exactly the same rows does not
prevent a child row from being inserted and immediately deleted again,
not if both actions happen *after* the parent-updating transaction
took its snapshot, but *before* it takes the crosscheck snapshot.

Let parent(id) and child(id, parent_id) again be two tables with a
FK constraint between them, let child be initially empty, and let
parent contain a single row (1).

Assume PD is a transaction which deletes all rows from parent,
CI a transaction which inserts the row (1, 1) into child, and
CD a transaction which deletes that row again.

Even with the extended cross-checking you propose, we'd still allow
the following concurrent schedule

 1. PD takes snapshot
 2. CI starts and completes
 3. CD starts and completes
 4. PD deletes from parent without complaining, since there were
no conflicting rows at time (1), and none at time (4).

So far, all is well. But add two more tables, call them a and b,
both initially containing one row. Let CI scan a, let PD delete from
a and scan b, and let CD delete from b. In the concurrent schedule
from above, CI will see the row in a, and PD will delete it, and
PD will see the row in b that CD deletes. Note that SSI *will*
allow that schedule to occur without raising a serialization error.
The only serial schedule which yields the same results for the
various queries pertaining to a and b is

  CI -> PD -> CD,

i.e. PD has to run *between* CI and CD. But in that serial schedule,
PD *should* see an FK violation, since CI has inserted a child which
CD hasn't yet deleted.

There is thus *no* serial schedule which yields the same results as the
concurrent schedule above for the queries pertaining to parent and child,
*and* for the queries pertaining to a and b, i.e. the concurrent
schedule is *not* serializable. Yet even the extended cross-check
won't detect this.
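The visibility argument can be checked mechanically with a toy MVCC model (all names below are invented; this mimics only snapshot visibility, nothing else): a child row inserted and deleted entirely between PD's transaction snapshot and its crosscheck snapshot is invisible to both, so comparing the two row sets can never reveal it.

```c
#include <assert.h>
#include <stdbool.h>

/* A child row version: when it was inserted and deleted (0 = never). */
typedef struct
{
    int insert_time;
    int delete_time;    /* 0 means not deleted */
} ChildRow;

/*
 * A snapshot is just the moment it was taken, in this toy model; a row is
 * visible if it was inserted before, and not yet deleted as of, that moment.
 */
static bool
row_visible(ChildRow row, int snapshot_time)
{
    return row.insert_time < snapshot_time &&
           (row.delete_time == 0 || row.delete_time > snapshot_time);
}
```

Both snapshots see an empty child table, so the proposed cross-check passes even though the row existed in between.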

Attached is an isolationtester spec file which implements this example,
and the corresponding out-file which shows that SSI permits the concurrent
schedule. Since SSI doesn't concern itself with RI enforcement queries,
it would also permit that schedule if we extended the cross-check, I think.

(I used REL9_4_STABLE as of today to try this, commit
 1cf54b00ba2100083390223a8244430643c1ec07)

best regards,
Florian Pflug


fk-consistency2.spec
Description: Binary data


fk-consistency2.out
Description: Binary data



Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Michael Paquier
On Fri, Oct 24, 2014 at 11:43 AM, Ali Akbar  wrote:

>
> 2014-10-24 16:26 GMT+07:00 Pavel Stehule :
>
>> Hi
>>
>> something in the last patch is wrong; I cannot compile it:
>>
>> arrayfuncs.c: In function 'accumArrayResult':
>> arrayfuncs.c:4603:9: error: 'ArrayBuildState' has no member named 'alen'
>>astate->alen = 64;  /* arbitrary starting array size */
>>  ^
>> arrayfuncs.c:4604:9: error: 'ArrayBuildState' has no member named
>> 'dvalues'
>>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
>>  ^
>> arrayfuncs.c:4604:44: error: 'ArrayBuildState' has no member named 'alen'
>>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
>> ^
>> arrayfuncs.c:4605:9: error: 'ArrayBuildState' has no member named 'dnulls'
>>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
>>  ^
>> arrayfuncs.c:4605:42: error: 'ArrayBuildState' has no member named 'alen'
>>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
>>   ^
>> arrayfuncs.c:4606:9: error: 'ArrayBuildState' has no member named 'nelems'
>>astate->nelems = 0;
>>  ^
>> arrayfuncs.c:4618:13: error: 'ArrayBuildState' has no member named
>> 'nelems'
>>if (astate->nelems >= astate->alen)
>>  ^
>> arrayfuncs.c:4618:31: error: 'ArrayBuildState' has no member named 'alen'
>>if (astate->nelems >= astate->alen)
>>^
>> arrayfuncs.c:4620:10: error: 'ArrayBuildState' has no member named 'alen'
>> astate->alen *= 2;
>>
>
> Sorry, correct patch attached.
>
> This patch is in patience format (git --patience ..). In previous patches,
> I used context format (git --patience ... | filterdiff --format=context),
> but it turns out that some modifications were lost.
>
That's not surprising; sometimes filterdiff misses the shot.
-- 
Michael


Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-10-24 Thread Andres Freund
On 2014-10-24 15:59:30 +0530, Amit Kapila wrote:
> > > and w.r.t performance it can lead extra
> > > function call, few checks and I think in some cases even can
> > > acquire/release spinlock.
> >
> > I fail to see how that could be the case.
> 
> Won't it happen in case the first backend sets releaseOK to true and
> another backend, which tries to wake up waiters on the lock, acquires
> the spinlock and tries to release the waiters?

Sure, that can happen.

> > And again, this is code that's
> > only executed around a couple syscalls. And the cacheline will be
> > touched around there *anyway*.
> 
> Sure, but I think syscalls are required in case we need to wake any
> waiter.

It won't wake up a waiter if there's none on the list.
> > > > And it'd be a pretty pointless
> > > > behaviour, leading to useless increased contention. The only time it'd
> > > > make sense for X to be woken up is when it gets run faster than the S
> > > > processes.
> > >
> > > Do we get any major benefit by changing the logic of waking up waiters?
> >
> > Yes.
> 
> I think one downside I could see of the new strategy is that the chance
> of an exclusive waiter taking more time before getting woken up is
> increased, as it will now bypass exclusive waiters in the queue.

Note that that *already* happens for any *new* shared locker that comes
in. It doesn't really make sense to have share lockers queued behind the
exclusive locker if others just go in front of it anyway.

> > > Code is more readable, but I don't understand why you
> > > want to do refactoring as part of this patch which ideally
> > > doesn't get any benefit from the same.
> >
> > I did it first without. But there's required stuff like
> > LWLockDequeueSelf(). And I had several bugs because of the list stuff.
> >
> > And I did separate the conversion into a separate patch?
> 
> Yeah, but the main patch for wait free LW_SHARED also uses
> it.

Well, the only thing that it could have done given that the other patch
is a prerequisite is reverting the behaviour?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-10-24 Thread Amit Kapila
On Wed, Oct 22, 2014 at 8:04 PM, Andres Freund 
wrote:
> On 2014-10-21 19:56:05 +0530, Amit Kapila wrote:
> > On Wed, Oct 8, 2014 at 6:17 PM, Andres Freund 
> > wrote:
> > spin_delay_count gives
> > how much delay has happened to acquire spinlock which when
> > combined with other stats gives the clear situation about
> > the contention around aquisation of corresponding LWLock.
> > Now if we want to count the spin lock delay for Release call
> > as well, then the meaning of the stat is getting changed.
> > It might be that new meaning of spin_delay_count stat is more
> > useful in some situations, however the older one has its own
> > benefits, so I am not sure if changing this as part of this
> > patch is the best thing to do.
>
> In which case does the old definition make sense, where the new one
> doesn't? I don't think it exists.
>
> And changing it here seems to make sense because spinlock contention
> fundamentally changes it meaning for lwlocks anyway as in most paths we
> don't take a spinlock anymore.

On second thought, I think probably you are right here.

> > > > Why can't we decrement the nwaiters after waking up? I don't think
> > > > there is any major problem even if callers do that themselves, but
> > > > in some rare cases LWLockRelease() might spuriously assume that
> > > > there are some waiters and tries to call LWLockWakeup().  Although
> > > > this doesn't create any problem, keeping the value sane is good
> > > > unless there is some problem in doing so.
> > >
> > > That won't work because then LWLockWakeup() wouldn't be called when
> > > necessary - precisely because nwaiters is 0.
> >
> > > The reason I've done so is that it's otherwise much harder to debug
> > > issues where there are backend that have been woken up already, but
> > > haven't rerun yet. Without this there's simply no evidence of that
> > > state. I can't see this being relevant for performance, so I'd rather
> > > have it stay that way.
> >
> > I am not sure what useful information we can get during debugging by not
> > doing this in LWLockWakeup()
>
> It's useful because you can detect backends that have been scheduled to
> acquire the lock, but haven't yet. They're otherwise "invisible".
>
> > and w.r.t performance it can lead extra
> > function call, few checks and I think in some cases even can
> > acquire/release spinlock.
>
> I fail to see how that could be the case.

Won't it happen in case the first backend sets releaseOK to true and
another backend, which tries to wake up waiters on the lock, acquires
the spinlock and tries to release the waiters?

> And again, this is code that's
> only executed around a couple syscalls. And the cacheline will be
> touched around there *anyway*.

Sure, but I think syscalls are required in case we need to wake any
waiter.

> >
> >
> > In the above code, if the first waiter to be woken up is an Exclusive
> > waiter, then it will wake that waiter; else it wakes shared waiters
> > until it gets to the first exclusive waiter, and then the first
> > exclusive waiter.
>
> That would be a bug then.

I am not sure of it, but I think it's more important to validate the
new waking strategy, as you see benefits by doing so.

>Per the comment you quoted "If the front
> waiter wants exclusive lock, awaken him only. Otherwise awaken as many
> waiters as want shared access.".
>
> But I don't think it's what's happening. Note that 'proc =
> proc->lwWaitLink;' is only executed if 'proc->lwWaitLink->lwWaitMode !=
> LW_EXCLUSIVE'. Which is the next waiter...
>
>
> > > And it'd be a pretty pointless
> > > behaviour, leading to useless increased contention. The only time it'd
> > > make sense for X to be woken up is when it gets run faster than the S
> > > processes.
> >
> > Do we get any major benefit by changing the logic of waking up waiters?
>
> Yes.

I think one downside I could see of the new strategy is that the chance
of an exclusive waiter taking more time before getting woken up is
increased, as it will now bypass exclusive waiters in the queue.  I don't
have any concrete proof that it can do any harm to performance, so maybe
it's okay to have this new mechanism; however, I think it might be
helpful if you could add a comment in the code to explain the benefit of
skipping exclusive lockers.

> > Code is more readable, but I don't understand why you
> > want to do refactoring as part of this patch which ideally
> > doesn't get any benefit from the same.
>
> I did it first without. But there's required stuff like
> LWLockDequeueSelf(). And I had several bugs because of the list stuff.
>
> And I did separate the conversion into a separate patch?

Yeah, but the main patch for wait free LW_SHARED also uses
it.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] pg_receivexlog --status-interval add fsync feedback

2014-10-24 Thread furuyao
> >> Sorry, I'm going around in the circle. But I'd like to say again, I
> >> don't think this is good idea. It prevents asynchronous
> >> pg_receivexlog from fsyncing WAL data and sending feedbacks more
> >> frequently at all. They are useful, for example, when we want to
> >> monitor the write location of asynchronous pg_receivexlog in almost
> >> realtime. But if we adopt the idea, since feedback cannot be sent
> >> soon in async mode, pg_stat_replication always returns the
> >> not-up-to-date location.
> >
> > Why not send a message every 10 seconds when its not sync rep?
> 
> Or even after every write(). It's a tiny amount of network traffic anyway.

I understand that sending feedback messages frequently will keep
pg_stat_replication up to date.

Are there really no users who want to fsync even in async mode?
I think people who dislike data loss would like that idea.
Thoughts?

Either way, sync or async, is returning feedback and executing fsync()
the same way walreceiver does really such a problem?

--
Furuya Osamu




Re: [HACKERS] detect custom-format dumps in psql and emit a useful error

2014-10-24 Thread Andres Freund
On 2014-10-24 07:18:55 -0300, Alvaro Herrera wrote:
> Jeevan Chalke wrote:
> 
> > Regarding Directory Error:
> > ===
> > I am strongly against the proposal. This patch changes the error
> > message to something like this:
> > "psql:blah:0: Input path is a directory. Use pg_restore to restore
> > directory-format database dumps."
> > 
> > So even though I accidentally provide a directory instead of an SQL
> > script file, when I have NO intention of restoring a dump, the above
> > message looks weird. The current message looks perfectly fine here, i.e.
> > "could not read from input file: Is a directory"
> > 
> > psql always expects a file and NOT a directory. Also, it is not
> > necessarily being used to restore a dump.
> 
> Yeah, this patch is a lot more debatable than the other one.  I have
> pushed the first one without changing the error message.

We could just test for toc.dat and then emit the warning...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] detect custom-format dumps in psql and emit a useful error

2014-10-24 Thread Alvaro Herrera
Jeevan Chalke wrote:

> Regarding Directory Error:
> ===
> I am strongly against the proposal. This patch changes the error
> message to something like this:
> "psql:blah:0: Input path is a directory. Use pg_restore to restore
> directory-format database dumps."
> 
> So even though I accidentally provide a directory instead of an SQL
> script file, when I have NO intention of restoring a dump, the above
> message looks weird. The current message looks perfectly fine here, i.e.
> "could not read from input file: Is a directory"
> 
> psql always expects a file and NOT a directory. Also, it is not
> necessarily being used to restore a dump.

Yeah, this patch is a lot more debatable than the other one.  I have
pushed the first one without changing the error message.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] TODO: Helpful hint from psql on createdb

2014-10-24 Thread Craig Ringer
Hi all

Following on from my recent patch about adding a hint to psql when you
try to use it to apply a custom-format dump:

http://www.postgresql.org/message-id/544096e0.5020...@2ndquadrant.com

I think we should do the same for the command line tools, like createdb,
createuser, etc.

I've seen this often enough:

postgres=> createdb test
postgres->
postgres->

and it should be solvable with the same approach I used for PGDMP.

It's only a couple of hours work at most but I'm not going to get to it
in a hurry, so let's pop it on the TODO for volunteers.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] ARMv5

2014-10-24 Thread Александр Глухов

Hello, I have a problem with PostgreSQL. I need to install PostgreSQL on ARMv5
with TCP/IP access, but I have no experience with it. I am trying to do it in
LTIB, and now I need to create a postgresql.spec file. Could you help me with it?

With best regards,
Alexandr Glukhov
 
 

Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Ali Akbar
2014-10-24 16:26 GMT+07:00 Pavel Stehule :

> Hi
>
> Something in the last patch is wrong; I cannot compile it:
>
> arrayfuncs.c: In function ‘accumArrayResult’:
> arrayfuncs.c:4603:9: error: ‘ArrayBuildState’ has no member named ‘alen’
>astate->alen = 64;  /* arbitrary starting array size */
>  ^
> arrayfuncs.c:4604:9: error: ‘ArrayBuildState’ has no member named ‘dvalues’
>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
>  ^
> arrayfuncs.c:4604:44: error: ‘ArrayBuildState’ has no member named ‘alen’
>astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
> ^
> arrayfuncs.c:4605:9: error: ‘ArrayBuildState’ has no member named ‘dnulls’
>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
>  ^
> arrayfuncs.c:4605:42: error: ‘ArrayBuildState’ has no member named ‘alen’
>astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
>   ^
> arrayfuncs.c:4606:9: error: ‘ArrayBuildState’ has no member named ‘nelems’
>astate->nelems = 0;
>  ^
> arrayfuncs.c:4618:13: error: ‘ArrayBuildState’ has no member named ‘nelems’
>if (astate->nelems >= astate->alen)
>  ^
> arrayfuncs.c:4618:31: error: ‘ArrayBuildState’ has no member named ‘alen’
>if (astate->nelems >= astate->alen)
>^
> arrayfuncs.c:4620:10: error: ‘ArrayBuildState’ has no member named ‘alen’
> astate->alen *= 2;
>

Sorry, correct patch attached.

This patch is in patience format (git --patience ..). In previous patches,
I used context format (git --patience ... | filterdiff --format=context),
but it turns out that some modifications were lost.

-- 
Ali Akbar
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7e5bcd9..f59738a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12046,6 +12046,22 @@ NULL baz(3 rows)
  
   

+array_agg
+   
+   array_agg(anyarray)
+  
+  
+   any
+  
+  
+   the same array type as input type
+  
+  input arrays, aggregated into higher-order multidimesional array. Rejects NULL and empty array as input.
+ 
+
+ 
+  
+   
 average


diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 2f0680f..8c182a4 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -2238,6 +2238,11 @@ SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
  array
 ---
  {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413}
+
+SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i));
+ array
+---
+ {{1},{2},{3},{4},{5}}
 (1 row)
 
The subquery must return a single column. The resulting
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 41e973b..0261fcb 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -108,12 +108,16 @@ exprType(const Node *expr)
 	type = exprType((Node *) tent->expr);
 	if (sublink->subLinkType == ARRAY_SUBLINK)
 	{
-		type = get_array_type(type);
-		if (!OidIsValid(type))
-			ereport(ERROR,
-	(errcode(ERRCODE_UNDEFINED_OBJECT),
-	 errmsg("could not find array type for data type %s",
-			format_type_be(exprType((Node *) tent->expr);
+		if (!OidIsValid(get_element_type(type)))
+		{
+			/* not array, so check for its array type */
+			type = get_array_type(type);
+			if (!OidIsValid(type))
+ereport(ERROR,
+		(errcode(ERRCODE_UNDEFINED_OBJECT),
+		 errmsg("could not find array type for data type %s",
+format_type_be(exprType((Node *) tent->expr);
+		}
 	}
 }
 else if (sublink->subLinkType == MULTIEXPR_SUBLINK)
@@ -139,12 +143,16 @@ exprType(const Node *expr)
 	type = subplan->firstColType;
 	if (subplan->subLinkType == ARRAY_SUBLINK)
 	{
-		type = get_array_type(type);
-		if (!OidIsValid(type))
-			ereport(ERROR,
-	(errcode(ERRCODE_UNDEFINED_OBJECT),
-	 errmsg("could not find array type for data type %s",
-	format_type_be(subplan->firstColType;
+		if (!OidIsValid(get_element_type(type)))
+		{
+			/* not array, so check for its array type */
+			type = get_array_type(type);
+			if (!OidIsValid(type))
+ereport(ERROR,
+		(errcode(ERRCODE_UNDEFINED_OBJECT),
+		 errmsg("could not find array type for data type %s",
+		format_type_be(subplan->firstColType;
+		}
 	}
 }
 else if (subplan->subLinkType == MULTIEXPR_SUBLINK)
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 3e7dc85..8fc8b49 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b

Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Pavel Stehule
Hi

Something in the last patch is wrong; I cannot compile it:

arrayfuncs.c: In function ‘accumArrayResult’:
arrayfuncs.c:4603:9: error: ‘ArrayBuildState’ has no member named ‘alen’
   astate->alen = 64;  /* arbitrary starting array size */
 ^
arrayfuncs.c:4604:9: error: ‘ArrayBuildState’ has no member named ‘dvalues’
   astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
 ^
arrayfuncs.c:4604:44: error: ‘ArrayBuildState’ has no member named ‘alen’
   astate->dvalues = (Datum *) palloc(astate->alen * sizeof(Datum));
^
arrayfuncs.c:4605:9: error: ‘ArrayBuildState’ has no member named ‘dnulls’
   astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
 ^
arrayfuncs.c:4605:42: error: ‘ArrayBuildState’ has no member named ‘alen’
   astate->dnulls = (bool *) palloc(astate->alen * sizeof(bool));
  ^
arrayfuncs.c:4606:9: error: ‘ArrayBuildState’ has no member named ‘nelems’
   astate->nelems = 0;
 ^
arrayfuncs.c:4618:13: error: ‘ArrayBuildState’ has no member named ‘nelems’
   if (astate->nelems >= astate->alen)
 ^
arrayfuncs.c:4618:31: error: ‘ArrayBuildState’ has no member named ‘alen’
   if (astate->nelems >= astate->alen)
   ^
arrayfuncs.c:4620:10: error: ‘ArrayBuildState’ has no member named ‘alen’
astate->alen *= 2;


2014-10-24 11:24 GMT+02:00 Ali Akbar :

> 2014-10-24 15:48 GMT+07:00 Pavel Stehule :
>
>> Hi
>>
>> it looks well
>>
>> doc:
>> http://www.postgresql.org/docs/9.4/static/sql-expressions.html#SQL-SYNTAX-ARRAY-CONSTRUCTORS
>> it should be fixed too
>>
>> Regards
>>
>> Pavel
>>
>
> doc updated with additional example for array(subselect). patch attached.
>
> Regards,
> --
> Ali Akbar
>


Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Ali Akbar
2014-10-24 15:48 GMT+07:00 Pavel Stehule :

> Hi
>
> it looks well
>
> doc:
> http://www.postgresql.org/docs/9.4/static/sql-expressions.html#SQL-SYNTAX-ARRAY-CONSTRUCTORS
> it should be fixed too
>
> Regards
>
> Pavel
>

doc updated with additional example for array(subselect). patch attached.

Regards,
--
Ali Akbar
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 12046,12051  NULL baz(3 rows)
--- 12046,12067 
   

 
+ array_agg
+
+array_agg(anyarray)
+   
+   
+any
+   
+   
+the same array type as input type
+   
+   input arrays, aggregated into higher-order multidimesional array. Rejects NULL and empty array as input.
+  
+ 
+  
+   
+
  average
 
 
*** a/doc/src/sgml/syntax.sgml
--- b/doc/src/sgml/syntax.sgml
***
*** 2238,2243  SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
--- 2238,2248 
   array
  ---
   {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413}
+ 
+ SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i));
+  array
+ ---
+  {{1},{2},{3},{4},{5}}
  (1 row)
  
 The subquery must return a single column. The resulting
*** a/src/backend/nodes/nodeFuncs.c
--- b/src/backend/nodes/nodeFuncs.c
***
*** 108,119  exprType(const Node *expr)
  	type = exprType((Node *) tent->expr);
  	if (sublink->subLinkType == ARRAY_SUBLINK)
  	{
! 		type = get_array_type(type);
! 		if (!OidIsValid(type))
! 			ereport(ERROR,
! 	(errcode(ERRCODE_UNDEFINED_OBJECT),
! 	 errmsg("could not find array type for data type %s",
! 			format_type_be(exprType((Node *) tent->expr);
  	}
  }
  else if (sublink->subLinkType == MULTIEXPR_SUBLINK)
--- 108,123 
  	type = exprType((Node *) tent->expr);
  	if (sublink->subLinkType == ARRAY_SUBLINK)
  	{
! 		if (!OidIsValid(get_element_type(type)))
! 		{
! 			/* not array, so check for its array type */
! 			type = get_array_type(type);
! 			if (!OidIsValid(type))
! ereport(ERROR,
! 		(errcode(ERRCODE_UNDEFINED_OBJECT),
! 		 errmsg("could not find array type for data type %s",
! format_type_be(exprType((Node *) tent->expr);
! 		}
  	}
  }
  else if (sublink->subLinkType == MULTIEXPR_SUBLINK)
***
*** 139,150  exprType(const Node *expr)
  	type = subplan->firstColType;
  	if (subplan->subLinkType == ARRAY_SUBLINK)
  	{
! 		type = get_array_type(type);
! 		if (!OidIsValid(type))
! 			ereport(ERROR,
! 	(errcode(ERRCODE_UNDEFINED_OBJECT),
! 	 errmsg("could not find array type for data type %s",
! 	format_type_be(subplan->firstColType;
  	}
  }
  else if (subplan->subLinkType == MULTIEXPR_SUBLINK)
--- 143,158 
  	type = subplan->firstColType;
  	if (subplan->subLinkType == ARRAY_SUBLINK)
  	{
! 		if (!OidIsValid(get_element_type(type)))
! 		{
! 			/* not array, so check for its array type */
! 			type = get_array_type(type);
! 			if (!OidIsValid(type))
! ereport(ERROR,
! 		(errcode(ERRCODE_UNDEFINED_OBJECT),
! 		 errmsg("could not find array type for data type %s",
! 		format_type_be(subplan->firstColType;
! 		}
  	}
  }
  else if (subplan->subLinkType == MULTIEXPR_SUBLINK)
*** a/src/backend/optimizer/plan/subselect.c
--- b/src/backend/optimizer/plan/subselect.c
***
*** 668,677  build_subplan(PlannerInfo *root, Plan *plan, PlannerInfo *subroot,
  
  		Assert(!te->resjunk);
  		Assert(testexpr == NULL);
! 		arraytype = get_array_type(exprType((Node *) te->expr));
! 		if (!OidIsValid(arraytype))
! 			elog(ERROR, "could not find array type for datatype %s",
!  format_type_be(exprType((Node *) te->expr)));
  		prm = generate_new_param(root,
   arraytype,
   exprTypmod((Node *) te->expr),
--- 668,683 
  
  		Assert(!te->resjunk);
  		Assert(testexpr == NULL);
! 
! 		arraytype = exprType((Node *) te->expr);
! 		if (!OidIsValid(get_element_type(arraytype)))
! 		{
! 			/* not array, so get the array type */
! 			arraytype = get_array_type(exprType((Node *) te->expr));
! 			if (!OidIsValid(arraytype))
! elog(ERROR, "could not find array type for datatype %s",
! 	 format_type_be(exprType((Node *) te->expr)));
! 		}
  		prm = generate_new_param(root,
   arraytype,
   exprTypmod((Node *) te->expr),
*** a/src/backend/utils/adt/array_userfuncs.c
--- b/src/backend/utils/adt/array_userfuncs.c
***
*** 16,22 
  #include "utils/builtins.h"
  #include "utils/lsyscache.h"
  
- 
  /*--

Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Pavel Stehule
Hi

it looks well

doc:
http://www.postgresql.org/docs/9.4/static/sql-expressions.html#SQL-SYNTAX-ARRAY-CONSTRUCTORS
it should be fixed too

Regards

Pavel

2014-10-24 10:24 GMT+02:00 Ali Akbar :

> Updated patch attached.
>
> 2014-10-22 20:51 GMT+07:00 Pavel Stehule :
>
>> I agree with your proposal. I have a few comments to design:
>>
>
>> 1. patch doesn't hold documentation and regress tests, please append
>> it.
>>
>
> I've added some regression tests in arrays.sql and aggregate.sql.
>
> I've only found the documentation of array_agg. Where is the doc for
> array(subselect) defined?
>
>
>> 2. this functionality (multidimensional aggregation) can be interesting
>> more times, so maybe some interface like array builder should be 
>> preferred.
>>
> We already have accumArrayResult and
> makeArrayResult/makeMdArrayResult, haven't we?
>
> Actually array_agg(anyarray) can be implemented by using
> accumArrayResult and makeMdArrayResult, but in the process we will
> deconstruct the array data and null bit-flags into 
> ArrayBuildState->dvalues
> and dnulls. And we will reconstruct it through makeMdArrayResult. I think
> this way it's not efficient, so i decided to reimplement it and memcpy the
> array data and null flags as-is.
>

 aha, so isn't better to fix a performance for accumArrayResult ?

>>>
>>> Ok, i'll go this route. While reading the code of array(subselect) as
>>> you pointed below, i think the easiest way to implement support for both
>>> array_agg(anyarray) and array(subselect) is to change accumArrayResult so
>>> it operates both with scalar datum(s) and array datums, with performance
>>> optimization for the latter.
>>>
>>
> implemented it by modifying ArrayBuildState to ArrayBuildStateArray and
> ArrayBuildStateScalar (do you have any idea for better struct naming?)
>
>
>> In other places, i think it's clearer if we just use accumArrayResult and
> makeArrayResult/makeMdArrayResult as API for building (multidimensional)
> arrays.
>
>
>> 3. array_agg was consistent with array(subselect), so it should be
>> fixed too
>>
>> postgres=# select array_agg(a) from test;
>>array_agg
>> ---
>>  {{1,2,3,4},{1,2,3,4}}
>> (1 row)
>>
>> postgres=# select array(select a from test);
>> ERROR:  could not find array type for data type integer[]
>>
>
> I'm pretty new in postgresql development. Can you point out where is
> array(subselect) implemented?
>

 where you can start?

 postgres=# explain select array(select a from test);
 ERROR:  42704: could not find array type for data type integer[]
 LOCATION:  exprType, nodeFuncs.c:116

 look to code ... your magic keyword is a ARRAY_SUBLINK .. search in
 postgresql sources this keyword

>>>
>> attention: probably we don't would to allow arrays everywhere.
>>
>
> I've changed the places where i think it's appropriate.
>
>
>> 4. why you use a magic constant (64) there?
>>
>> + astate->abytes = 64 * (ndatabytes == 0 ? 1 : ndatabytes);
>> + astate->aitems = 64 * nitems;
>>
>> + astate->nullbitmap = (bits8 *)
>> + repalloc(astate->nullbitmap, (astate->aitems + 7) /
>> 8);
>>
>
> just follow the arbitrary size choosen in accumArrayResult
> (arrayfuncs.c):
> astate->alen = 64; /* arbitrary starting array size */
>
> it can be any number not too small and too big. Too small, and we will
> realloc shortly. Too big, we will end up wasting memory.
>

 you can try to alloc 1KB instead as a start -- it is used more often in
 Postgres. Then the overhead is at most 1KB per agg call - which is
 acceptable.

 You take this value from accumArrayResult, but it is targeted at short
 scalars - you should expect any array to be much larger.

>>>
> this patch allocates 1KB (1024 bytes) if ndatabytes is < 512 bytes. If
> it is larger, it allocates 4 * size. For the null bitmap, it allocates
> 4 * the number of items in the array.
>
> Regards,
> --
> Ali Akbar
>
>


Re: [HACKERS] make pg_controldata accept "-D dirname"

2014-10-24 Thread Michael Paquier
On Thu, Sep 25, 2014 at 12:38 PM, Heikki Linnakangas <
hlinnakan...@vmware.com> wrote:

> On 09/24/2014 05:48 PM, Abhijit Menon-Sen wrote:
>
>> Updated patches attached.
>>
>
> Thanks, applied some version of these.
>
This set of patches has been applied as b0d81ad but patch 0001 did not make
it in, so pg_controldata is not yet able to accept -D:
$ pg_controldata -D /foo/path
pg_controldata: could not open file "-D/global/pg_control" for reading: No
such file or directory
$ pg_controldata /foo/path
pg_controldata: could not open file "/foo/path/global/pg_control" for
reading: No such file or directory
-- 
Michael


Re: [HACKERS] alter user/role CURRENT_USER

2014-10-24 Thread Kyotaro HORIGUCHI
Hi, here is the revised patch.

Attached files are the followings

 - 0001-ALTER-ROLE-CURRENT_USER_v2.patch  - the patch.

 - testset.tar.bz2 - test set. Run by typing 'make check' as a
   superuser of the running PostgreSQL server. It creates "testdb"
   and some roles.

Documents are not edited this time.

 

Considering your comments, I found more points to modify.

 - CREATE SCHEMA (IF NOT EXISTS) .. AUTHORIZATION  ...

 - ALTER AGGREAGE/COLLATION/etc... OWNER TO 

 - CREATE/ALTER/DROP USER MAPPING FOR  SERVER ..

GRANT/REVOKE also take a role as an argument, but CURRENT_USER and
similar keywords seem to be useless for them.

Finally, the new patch modifies the following points.

In gram.y,

 - RoleId's are replaced with RoleId_or_curruser in more places.
   It accepts CURRENT_USER/USER/CURRENT_ROLE/SESSION_USER.

 - ALTER USER ALL syntax is added. (not changed from the previous patch)

 - The non-terminal auth_ident now uses RoleId_or_curruser
   instead of RoleId. This affects CREATE/ALTER/DROP USER MAPPING

In user.c, a new function ResolveRoleId() is added and used for all
role ID resolutions that correspond to the syntax changes in the
parser; in user.c itself that caller is AlterRole().

In foreigncmds.c, GetUserOidFromMapping() is removed and
ResolveRoleId is used instead.

In alter.c and tablecmds.c, all calls to get_role_oid() that
correspond to the grammar change were replaced with ResolveRoleId().

The modifications are a bit complicated so I provided a
comprehensive test set.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From f55494a49b6d4c7eb32665ea9cc63634f5000c99 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 9 Sep 2014 19:26:24 +0900
Subject: [PATCH] ALTER ROLE CURRENT_USER

---
 src/backend/commands/alter.c   |  2 +-
 src/backend/commands/foreigncmds.c | 25 ++--
 src/backend/commands/schemacmds.c  |  3 +-
 src/backend/commands/tablecmds.c   |  2 +-
 src/backend/commands/user.c| 56 ++-
 src/backend/parser/gram.y  | 78 ++
 src/include/commands/dbcommands.h  |  1 +
 src/include/commands/user.h|  2 +
 8 files changed, 96 insertions(+), 73 deletions(-)

diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
index c9a9baf..15f254e 100644
--- a/src/backend/commands/alter.c
+++ b/src/backend/commands/alter.c
@@ -678,7 +678,7 @@ AlterObjectNamespace_internal(Relation rel, Oid objid, Oid nspOid)
 Oid
 ExecAlterOwnerStmt(AlterOwnerStmt *stmt)
 {
-	Oid			newowner = get_role_oid(stmt->newowner, false);
+	Oid			newowner = ResolveRoleId(stmt->newowner, false, NULL);
 
 	switch (stmt->objectType)
 	{
diff --git a/src/backend/commands/foreigncmds.c b/src/backend/commands/foreigncmds.c
index ab4ed6c..9878fca 100644
--- a/src/backend/commands/foreigncmds.c
+++ b/src/backend/commands/foreigncmds.c
@@ -27,6 +27,7 @@
 #include "catalog/pg_type.h"
 #include "catalog/pg_user_mapping.h"
 #include "commands/defrem.h"
+#include "commands/user.h"
 #include "foreign/fdwapi.h"
 #include "foreign/foreign.h"
 #include "miscadmin.h"
@@ -198,24 +199,6 @@ transformGenericOptions(Oid catalogId,
 
 
 /*
- * Convert the user mapping user name to OID
- */
-static Oid
-GetUserOidFromMapping(const char *username, bool missing_ok)
-{
-	if (!username)
-		/* PUBLIC user mapping */
-		return InvalidOid;
-
-	if (strcmp(username, "current_user") == 0)
-		/* map to the owner */
-		return GetUserId();
-
-	/* map to provided user */
-	return get_role_oid(username, missing_ok);
-}
-
-/*
  * Internal workhorse for changing a data wrapper's owner.
  *
  * Allow this only for superusers; also the new owner must be a
@@ -1099,7 +1082,7 @@ CreateUserMapping(CreateUserMappingStmt *stmt)
 
 	rel = heap_open(UserMappingRelationId, RowExclusiveLock);
 
-	useId = GetUserOidFromMapping(stmt->username, false);
+	useId = ResolveRoleId(stmt->username, false, NULL);
 
 	/* Check that the server exists. */
 	srv = GetForeignServerByName(stmt->servername, false);
@@ -1194,7 +1177,7 @@ AlterUserMapping(AlterUserMappingStmt *stmt)
 
 	rel = heap_open(UserMappingRelationId, RowExclusiveLock);
 
-	useId = GetUserOidFromMapping(stmt->username, false);
+	useId = ResolveRoleId(stmt->username, false, NULL);
 	srv = GetForeignServerByName(stmt->servername, false);
 
 	umId = GetSysCacheOid2(USERMAPPINGUSERSERVER,
@@ -1276,7 +1259,7 @@ RemoveUserMapping(DropUserMappingStmt *stmt)
 	Oid			umId;
 	ForeignServer *srv;
 
-	useId = GetUserOidFromMapping(stmt->username, stmt->missing_ok);
+	useId = ResolveRoleId(stmt->username, stmt->missing_ok, NULL);
 	srv = GetForeignServerByName(stmt->servername, true);
 
 	if (stmt->username && !OidIsValid(useId))
diff --git a/src/backend/commands/schemacmds.c b/src/backend/commands/schemacmds.c
index 03f5514..4f97f23 100644
--- a/src/backend/commands/schemacmds.c
+++ b/src/backend/commands/schemacmds.c
@@ -25,6 +25,7 @@
 #include "catalog/pg_namespace.h"
 #include "commands/dbcommands.h"
 #include "com

Re: [HACKERS] Function array_agg(array)

2014-10-24 Thread Ali Akbar
Updated patch attached.

2014-10-22 20:51 GMT+07:00 Pavel Stehule :

> I agree with your proposal. I have a few comments on the design:
>

> 1. The patch doesn't include documentation or regression tests; please
> add them.
>

I've added some regression tests in arrays.sql and aggregate.sql.

I've only found the documentation of array_agg. Where is the doc for
array(subselect) defined?


> 2. This functionality (multidimensional aggregation) can be interesting
> in other contexts too, so maybe an interface like an array builder should
> be preferred.
>
 We already have accumArrayResult and makeArrayResult/makeMdArrayResult,
 don't we?

 Actually array_agg(anyarray) can be implemented using
 accumArrayResult and makeMdArrayResult, but in the process we will
 deconstruct the array data and null bit-flags into ArrayBuildState->dvalues
 and dnulls, and then reconstruct them through makeMdArrayResult. I think
 that is not efficient, so I decided to reimplement it and memcpy the
 array data and null flags as-is.

>>>
>>> Aha, so isn't it better to fix the performance of accumArrayResult?
>>>
>>
>> Ok, I'll go this route. While reading the code of array(subselect) as you
>> pointed out below, I think the easiest way to implement support for both
>> array_agg(anyarray) and array(subselect) is to change accumArrayResult so
>> it operates on both scalar datums and array datums, with a performance
>> optimization for the latter.
>>
>
I implemented it by splitting ArrayBuildState into ArrayBuildStateArray and
ArrayBuildStateScalar (do you have any ideas for better struct names?)


> In other places, I think it's clearer if we just use accumArrayResult and
 makeArrayResult/makeMdArrayResult as the API for building (multidimensional)
 arrays.


> 3. array_agg was consistent with array(subselect), so it should be
> fixed too
>
> postgres=# select array_agg(a) from test;
>array_agg
> ---
>  {{1,2,3,4},{1,2,3,4}}
> (1 row)
>
> postgres=# select array(select a from test);
> ERROR:  could not find array type for data type integer[]
>

 I'm pretty new to PostgreSQL development. Can you point out where
 array(subselect) is implemented?

>>>
>>> Here is where you can start:
>>>
>>> postgres=# explain select array(select a from test);
>>> ERROR:  42704: could not find array type for data type integer[]
>>> LOCATION:  exprType, nodeFuncs.c:116
>>>
>>> Look at the code ... your magic keyword is ARRAY_SUBLINK; search the
>>> PostgreSQL sources for this keyword.
>>>
>>
> Attention: we probably don't want to allow arrays everywhere.
>

I've changed the places where I think it's appropriate.


> 4. Why do you use a magic constant (64) there?
>
> + astate->abytes = 64 * (ndatabytes == 0 ? 1 : ndatabytes);
> + astate->aitems = 64 * nitems;
>
> + astate->nullbitmap = (bits8 *)
> + repalloc(astate->nullbitmap, (astate->aitems + 7) /
> 8);
>

 I just followed the arbitrary size chosen in accumArrayResult
 (arrayfuncs.c):
 astate->alen = 64; /* arbitrary starting array size */

 It can be any number that is neither too small nor too big. Too small, and
 we will repalloc shortly; too big, and we will end up wasting memory.

>>>
>>> You can try allocating 1 kB as the start instead -- that size is used in
>>> several places in Postgres. Then the overhead is at most 1 kB per
>>> aggregate call, which is acceptable.
>>>
>>> You took this value from accumArrayResult, but that is targeted at
>>> short scalars -- you should expect any array to be much larger.
>>>
>>
This patch allocates 1 kB (1024 bytes) if ndatabytes is < 512 bytes. If it
is larger, it allocates 4 * size. For the null bitmap, it allocates space
for 4 * the number of items in the array.

Regards,
--
Ali Akbar
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 12046,12051  NULL baz(3 rows)
--- 12046,12067 
   

 
+ array_agg
+
+array_agg(anyarray)
+   
+   
+any
+   
+   
+the same array type as input type
+   
+   input arrays, aggregated into a higher-dimensional array. Rejects NULL and empty arrays as input.
+  
+ 
+  
+   
+
  average
 
 
*** a/src/backend/nodes/nodeFuncs.c
--- b/src/backend/nodes/nodeFuncs.c
***
*** 108,119  exprType(const Node *expr)
  	type = exprType((Node *) tent->expr);
  	if (sublink->subLinkType == ARRAY_SUBLINK)
  	{
! 		type = get_array_type(type);
! 		if (!OidIsValid(type))
! 			ereport(ERROR,
! 	(errcode(ERRCODE_UNDEFINED_OBJECT),
! 	 errmsg("could not find array type for data type %s",
! 			format_type_be(exprType((Node *) tent->expr);
  	}
  }
  else if (sublink->subLinkType == MULTIEXPR_SUBLINK)
--- 108,123 
  	t

[HACKERS] Re: Getting rid of "accept incoming network connections" prompts on OS X

2014-10-24 Thread edward745
One of the queries in ri_triggers.c has me a little baffled.

For (relatively) obvious reasons, a FK insert triggers a SELECT 1 FROM
pk_rel ... FOR KEY SHARE.
For not-so-obvious reasons, a PK delete triggers a SELECT 1 FROM fk_rel ...
FOR KEY SHARE.

I can't see what the lock on fk_rel achieves. Both operations are already
contending for the lock on the PK row, which seems like enough to cover
every eventuality.

And even if the lock serves a purpose, KEY SHARE is an odd choice, since the
referencing field is, in general, not a "key" in this sense.






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-10-24 Thread Amit Kapila
On Wed, Oct 22, 2014 at 7:12 PM, Andres Freund 
wrote:
>
> On 2014-10-22 13:32:07 +0530, Amit Kapila wrote:
> > On Tue, Oct 21, 2014 at 7:56 PM, Amit Kapila 
> > wrote:
> > > On Wed, Oct 8, 2014 at 6:17 PM, Andres Freund 
> > wrote:
> > > >
> > > > On 2014-06-25 19:06:32 +0530, Amit Kapila wrote:
> >
> > Today, I have verified all previous comments raised by
> > me and looked at new code and below are my findings:
> >
> > >>
> > >> 4.
> > >> LWLockAcquireCommon()
> > >> {
> > >> ..
> > >> if (!LWLockDequeueSelf(l))
> > >> {
> > >>  for (;;)
> > >> {
> > >> PGSemaphoreLock(&proc->sem, false);
> > >>  if (!proc->lwWaiting)
> > >> break;
> > >> extraWaits++;
> > >>  }
> > >> lock->releaseOK = true;
> > >> ..
> > >> }
> > >>
> > >> Setting releaseOK in above context might not be required  because if
the
> > >> control comes in this part of code, it will not retry to acquire
another
> > >> time.
> >
> > > Hm. You're probably right.
> >
> > You have agreed to fix this comment, but it seems you have forgot
> > to change it.
>
> After I've thought more about it, it is actually required. This
> essentially *is* a retry.

Won't it need to be set before the retry? What's the use of setting it
when we have got the lock and are not going to retry?

> Somebody woke us up, which is where releaseOK is supposed to be set.

I think that is true only in the case when we are again going to retry, or
at least that seems to be the mechanism currently used in
LWLockAcquireCommon.

>
> > >> 11.
> > >> LWLockRelease()
> > >> {
> > >> ..
> > >> PRINT_LWDEBUG("LWLockRelease", lock, mode);
> > >> }
> > >>
> > >> Shouldn't this be in begining of LWLockRelease function rather than
> > >> after processing held_lwlocks array?
> >
> > > Ok.
> >
> > You have agreed to fix this comment, but it seems you have forgot
> > to change it.
>
>
>
> > Below comment doesn't seem to be addressed?
> >
> > > LWLockAcquireOrWait()
> > > {
> > > ..
> > > LOG_LWDEBUG("LWLockAcquireOrWait", lockid, mode, "succeeded");
> > > ..
> > > }
> >
> > > a. such a log is not there in any other LWLock.. variants,
> > >  if we want to introduce it, then shouldn't it be done at
> > >  other places as well.
>
> I think you're placing unnecessarily high consistency constraints on a
> debugging feature here.

This was just a minor suggestion to keep the code consistent; it's okay
if you want to ignore it. I understand that code consistency here doesn't
really matter.

> > Below point is yet to be resolved.
> >
> > > > 12.
> > > > #ifdef LWLOCK_DEBUG
> > > > lock->owner = MyProc;
> > > > #endif
> > > >
> > > > Shouldn't it be reset in LWLockRelease?
> > >
> > > That's actually intentional. It's quite useful to know the last owner
> > > when debugging lwlock code.
> >
> > Won't it cause any problem if the last owner process exits?
>
> No. PGPROCs aren't deallocated or anything. And it's a debugging only
> variable.

That's right; the problem I was thinking of is wrong information.
For example, if the process holding the Exclusive lock has exited, and
then a lot of other processes have taken shared locks and a new Exclusive
locker is waiting on the lock, at that moment during debugging we can get
wrong information about the lock owner.

However, I think you are mainly worried about situations where many
backends are waiting for an Exclusive locker, which is probably the
most common scenario.


> > Can you explain how pg_read_barrier() in below code makes this
> > access safe?
> >
> > LWLockWakeup()
> > {
> > ..
> > + pg_read_barrier(); /* pairs with nwaiters-- */
> > + if (!BOOL_ACCESS_ONCE(lock->releaseOK))
> > ..
> > }
>
> What's the concern you have? Full memory barriers (the atomic
> nwaiters--) pair with read memory barriers.

IIUC, the pairing with nwaiters in LWLockAcquireCommon() ensures
that releaseOK is set before attempting the lock again, as the atomic
operation provides the necessary barrier.  The point I am not
getting is what kind of guarantee pg_read_barrier() provides us,
or why it is required.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Getting rid of "accept incoming network connections" prompts on OS X

2014-10-24 Thread Michael Paquier
On Fri, Oct 24, 2014 at 8:26 AM, Dave Page  wrote:

> On Fri, Oct 24, 2014 at 7:18 AM, Peter Eisentraut  wrote:
> > On 10/21/14 1:16 PM, Tom Lane wrote:
> >> If you do any Postgres development on OS X, you've probably gotten
> >> seriously annoyed by the way that, every single time you reinstall the
> >> postmaster executable, you get a dialog box asking whether you'd like
> >> to allow it to accept incoming network connections.
> >
> > I used to, but somehow I don't see this anymore.  Just to be sure, I
> > made sure the firewall is on, checked that postgres is not in the
> > exception list, rebooted, built postgresql from scratch, ran make check,
> > but no pop-up.
> >
> > I'm on Yosemite.  Maybe this was changed.
>
> I've never seen it on any version of OS X (I've worked my way from
> Panther to Yosemite). There must be more to it...
>
FWIW, with the firewall on, I am used to seeing this annoying popup window
when starting an instance manually; make check never complains, though.
-- 
Michael