Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-07-20 Thread Michael Paquier
On Mon, Jul 20, 2015 at 9:59 PM, Beena Emerson  wrote:
> Simon Riggs wrote:
>
>> The choice between formats is not
>> solely predicated on whether we have multi-line support.
>
>> I still think writing down some actual use cases would help bring the
>> discussion to a conclusion. Inventing a general facility is hard without
>> some clear goals about what we need to support.
>
> We need to at least support the following:
> - Grouping: Specify a set of standbys along with the minimum number of
> commits required from the group.
> - Group Type: Groups can either be priority or quorum group.

As far as I understand, at the lowest level a group is just an alias
for a list of nodes; quorum and priority are properties that can be
applied to a group of nodes when this group is used in the expression
defining what synchronous commit means.

> - Group names: to simplify status reporting
> - Nesting: At least 2 levels of nesting

If I am following correctly, at the first level there is the
definition of the top-level objects, like groups and the sync expression.

> Using JSON, the sync rep parameter to replicate to 2 different clusters
> could be written as:
>
>{"remotes":
> {"quorum": 2,
>  "servers": [{"london":
> {"priority": 2,
>  "servers": ["lndn1", "lndn2", "lndn3"]
> }}
> ,
>   {"nyc":
> {"priority": 1,
>  "servers": ["ny1", "ny2"]
> }}
>   ]
> }
> }
> The same parameter in the new language (as suggested above) could be written
> as:
>  'remotes: 2(london: 1[lndn1, lndn2, lndn3], nyc: 1[ny1, ny2])'

OK, there is a typo. That's actually 2(london: 2[lndn1, lndn2, lndn3],
nyc: 1[ny1, ny2]) in your grammar. Honestly, if we want group aliases,
I think that JSON makes the most sense. One of the advantages of a
group is that you can use it in several places in the blob and set
different properties on it, hence we should be able to define a
group outside the sync expression.

Hence I think that something like this makes more sense:
{
    "sync_standby_names":
    {
        "quorum": 2,
        "nodes":
        [
            {"priority": 1, "group": "cluster1"},
            {"quorum": 2, "nodes": ["node1", "node2", "node3"]}
        ]
    },
    "groups":
    {
        "cluster1": ["node11", "node12", "node13"],
        "cluster2": ["node21", "node22", "node23"]
    }
}

> Also, I was thinking the name of the main group could be optional.
> Internally, it can be given the name 'default group' or 'main group' for
> status reporting.
>
> The above could also be written as:
>  '2(london: 2[lndn1, lndn2, lndn3], nyc: 1[ny1, ny2])'
>
> backward compatible:
> In JSON, while validating we may have to check if it starts with '{' to go

Something worth noting: application_name can begin with "{".

> for JSON parsing, or else proceed with the current method.

> A,B,C => 1[A,B,C]. This can be added in the new parser code.

This makes sense. We could do the same for the JSON-based format as
well, by reusing the in-memory structure used to deparse the blob when
the former grammar is used.
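
For instance, the legacy list could deparse into the same structure as
this JSON (a sketch mapping 'A,B,C' onto the format proposed above; the
exact key names are whatever we settle on):

{"sync_standby_names": {"priority": 1, "nodes": ["A", "B", "C"]}}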
-- 
Michael




Re: [HACKERS] security labels on databases are bad for dump & restore

2015-07-20 Thread Noah Misch
On Mon, Jul 20, 2015 at 07:01:14PM -0400, Adam Brightwell wrote:
> > Consistency with existing practice would indeed have pg_dump ignore
> > pg_shseclabel and have pg_dumpall reproduce its entries.
> 
> I think that makes sense, but what about other DATABASE level info
> such as COMMENT?  Should that be ignored by pg_dump as well?  I'm
> specifically thinking of the discussion from the following thread:
> 
> http://www.postgresql.org/message-id/20150317172459.gm3...@alvh.no-ip.org
> 
> If COMMENT is included then why not SECURITY LABEL or others?

In any given situation, we should indeed restore both pg_database comments and
pg_database security labels, or we should restore neither.  Restoring neither
is most consistent with history, but several people like the idea of restoring
both.  I won't mind either conclusion.

> > In a greenfield, I would make "pg_dump --create" reproduce pertinent entries
> > from datacl, pg_db_role_setting, pg_shseclabel and pg_shdescription.  I would
> > make non-creating pg_dump reproduce none of those.
> 
> I think the bigger question is "Where is the line drawn between
> pg_dump and pg_dumpall?".  At what point does one tool become the
> other?

That question may be too big for me.




Re: [HACKERS] "make check" changes have caused buildfarm deterioration.

2015-07-20 Thread Michael Paquier
On Tue, Jul 21, 2015 at 2:39 PM, Michael Paquier
 wrote:
> On Tue, Jul 21, 2015 at 6:17 AM, Tom Lane  wrote:
>> Andrew Dunstan  writes:
>>> Somewhere along the way some changes to the way we do "make check" have
>>> caused a significant deterioration in the buildfarm's logging. Compare
>>> these two from animal crake, which happens to be my test instance:
>>> 
>>> and
>>> 
>>
>> Yeah, I've been bitching about the poor logging for awhile, but I had
>> not realized that it's still working as-expected in the back branches.
>>
>> Comparing different branches, it looks like "somewhere along the way"
>> means "between 9.4 and 9.5".  I suspect that commit dcae5faccab64776, or
>> perhaps the followon dbf2ec1a1c053379, is to blame.
>
> Regarding install.log, the switch to stdout/stderr instead of a log
> file was made in dbf2ec1a, after this:
> http://www.postgresql.org/message-id/553fe7fc.2040...@gmx.net
> Since 9.5, as the location of the temporary installation is global, we
> could basically revert some parts of dcae5fac so that install.log is
> saved in $ROOT/tmp_install/log/install.log... But I am not sure what we
> would win with that, and the argument for removing install.log is that
> the temporary installation is now a make target. Both ways have
> advantages and disadvantages.
>
> Regarding initdb.log and postmaster.log, this is definitely a bug.
> Those were moved by dcae5fac from log/ to tmp_check/log/, and
> tmp_check/ gets removed at the end of pg_regress if no failures are
> counted. After the fix, both files will be saved in log/ at the
> location pg_regress is called from, using outputdir, whose default is
> ".". This way the behavior is similar to ~9.4. Attached is a patch to
> fix this for 9.5 and master.

Something I just noticed: an entry for log/ is missing from
test_ddl_deparse's .gitignore.
-- 
Michael




Re: [HACKERS] "make check" changes have caused buildfarm deterioration.

2015-07-20 Thread Michael Paquier
On Tue, Jul 21, 2015 at 6:17 AM, Tom Lane  wrote:
> Andrew Dunstan  writes:
>> Somewhere along the way some changes to the way we do "make check" have
>> caused a significant deterioration in the buildfarm's logging. Compare
>> these two from animal crake, which happens to be my test instance:
>> 
>> and
>> 
>
> Yeah, I've been bitching about the poor logging for awhile, but I had
> not realized that it's still working as-expected in the back branches.
>
> Comparing different branches, it looks like "somewhere along the way"
> means "between 9.4 and 9.5".  I suspect that commit dcae5faccab64776, or
> perhaps the followon dbf2ec1a1c053379, is to blame.

Regarding install.log, the switch to stdout/stderr instead of a log
file was made in dbf2ec1a, after this:
http://www.postgresql.org/message-id/553fe7fc.2040...@gmx.net
Since 9.5, as the location of the temporary installation is global, we
could basically revert some parts of dcae5fac so that install.log is
saved in $ROOT/tmp_install/log/install.log... But I am not sure what we
would win with that, and the argument for removing install.log is that
the temporary installation is now a make target. Both ways have
advantages and disadvantages.

Regarding initdb.log and postmaster.log, this is definitely a bug.
Those were moved by dcae5fac from log/ to tmp_check/log/, and
tmp_check/ gets removed at the end of pg_regress if no failures are
counted. After the fix, both files will be saved in log/ at the
location pg_regress is called from, using outputdir, whose default is
".". This way the behavior is similar to ~9.4. Attached is a patch to
fix this for 9.5 and master.
Regards,
-- 
Michael
From d113a885a2353da30a37424a3fed94b72922cdd0 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 21 Jul 2015 14:27:59 +0900
Subject: [PATCH] Fix location of output logs of pg_regress

initdb.log and postmaster.log were moved by dcae5fac into the temporary
instance path, tmp_check/log by default, which gets removed at the end
of a pg_regress run when no failures are found. This makes analysis of
after-run failures difficult in some cases, and reduces the buildfarm's
output verbosity after a run.

Per report from Andrew Dunstan.
---
 src/test/regress/pg_regress.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index ed8c369..dd65ab5 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -2207,7 +2207,7 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
 		make_directory(temp_instance);
 
 		/* and a directory for log files */
-		snprintf(buf, sizeof(buf), "%s/log", temp_instance);
+		snprintf(buf, sizeof(buf), "%s/log", outputdir);
 		if (!directory_exists(buf))
 			make_directory(buf);
 
@@ -2220,10 +2220,10 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
  temp_instance,
  debug ? " --debug" : "",
  nolocale ? " --no-locale" : "",
- temp_instance);
+ outputdir);
 		if (system(buf))
 		{
-			fprintf(stderr, _("\n%s: initdb failed\nExamine %s/log/initdb.log for the reason.\nCommand was: %s\n"), progname, temp_instance, buf);
+			fprintf(stderr, _("\n%s: initdb failed\nExamine %s/log/initdb.log for the reason.\nCommand was: %s\n"), progname, outputdir, buf);
 			exit(2);
 		}
 
@@ -2324,7 +2324,7 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
  bindir ? "/" : "",
  temp_instance, debug ? " -d 5" : "",
  hostname ? hostname : "", sockdir ? sockdir : "",
- temp_instance);
+ outputdir);
 		postmaster_pid = spawn_process(buf);
 		if (postmaster_pid == INVALID_PID)
 		{
@@ -2353,7 +2353,7 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
 			if (WaitForSingleObject(postmaster_pid, 0) == WAIT_OBJECT_0)
 #endif
 			{
-fprintf(stderr, _("\n%s: postmaster failed\nExamine %s/log/postmaster.log for the reason\n"), progname, temp_instance);
+fprintf(stderr, _("\n%s: postmaster failed\nExamine %s/log/postmaster.log for the reason\n"), progname, outputdir);
 exit(2);
 			}
 
@@ -2361,7 +2361,7 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
 		}
 		if (i >= 60)
 		{
-			fprintf(stderr, _("\n%s: postmaster did not respond within 60 seconds\nExamine %s/log/postmaster.log for the reason\n"), progname, temp_instance);
+			fprintf(stderr, _("\n%s: postmaster did not respond within 60 seconds\nExamine %s/log/postmaster.log for the reason\n"), progname, outputdir);
 
 			/*
 			 * If we get here, the postmaster is probably wedged somewhere in
-- 
2.4.6



Re: [HACKERS] A little RLS oversight?

2015-07-20 Thread Michael Paquier
On Tue, Jul 14, 2015 at 4:01 AM, Stephen Frost  wrote:
> Michael,
>
> * Michael Paquier (michael.paqu...@gmail.com) wrote:
>> On Sun, Jul 12, 2015 at 5:59 PM, Yaroslav wrote:
>> > I can still see all statistics for 'test' in pg_stats under unprivileged
>> > user.
>>
>> Indeed, this looks like an oversight of RLS. Even if a policy is
>> defined to prevent a user from seeing the rows of other users, it is
>> still possible to get some information through this view.
>> I am adding an open item regarding that for 9.5.
>
> We need to be careful to avoid the slippery slope of trying to prevent
> all covert channels, which has been extensively discussed previously.  I
> tend to agree with this specific case of, if you have RLS configured on
> the table then we probably shouldn't allow normal users to see the stats
> on the table, but I don't have a problem with the usage of those stats
> for generating plans, which users could see the results of via EXPLAIN.

You mean, for example, the case where EXPLAIN includes in its output
the number of rows filtered out for all users? This gives a hint about
the number of rows in a relation even to a user that has only limited
access to its rows via a policy.
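
For instance (a hypothetical session; table name, policy, and row
counts invented for illustration), a user whose policy hides most rows
can still read the table's cardinality off the plan:

EXPLAIN (ANALYZE) SELECT * FROM docs;
                        QUERY PLAN
---------------------------------------------------------
 Seq Scan on docs  (... actual rows=10 loops=1)
   Filter: (owner = CURRENT_USER)
   Rows Removed by Filter: 9990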

>> > I'd prefer statistics on RLS-enabled tables to be simply hidden completely
>> > for unprivileged users.
>>
>> This looks like something simple enough to do.
>> @Stephen: perhaps you have some thoughts on the matter? Currently
>> pg_stats breaks its promise to only show information about the rows
>> current user can read.
>
> I agree that it should be reasonably simple to do and, provided that's
> the case, I'm fine with doing it once I get back (currently out until
> the 27th).

Looking at that, I am not seeing any straightforward way to resolve
this issue except by hardening pg_stats with an additional filter of
this type, so that a non-owner of a relation cannot see the stats of
the table directly when RLS is enabled:
c.relrowsecurity = false OR c.relowner = current_user::regrole::oid
Attached is a patch doing that (/me now hides, expecting to receive
laser shots because of the use of current_user in a system view).
Thoughts?
-- 
Michael
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index e82a53a..3ecf948 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -211,7 +211,10 @@ CREATE VIEW pg_stats AS
 FROM pg_statistic s JOIN pg_class c ON (c.oid = s.starelid)
  JOIN pg_attribute a ON (c.oid = attrelid AND attnum = s.staattnum)
  LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace)
-WHERE NOT attisdropped AND has_column_privilege(c.oid, a.attnum, 'select');
+WHERE NOT attisdropped AND
+  has_column_privilege(c.oid, a.attnum, 'select') AND
+  (c.relrowsecurity = false OR
+   c.relowner = current_user::regrole::oid);
 
 REVOKE ALL on pg_statistic FROM public;
 
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index cd53375..67aa14a 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2061,7 +2061,7 @@ pg_stats| SELECT n.nspname AS schemaname,
  JOIN pg_class c ON ((c.oid = s.starelid)))
  JOIN pg_attribute a ON (((c.oid = a.attrelid) AND (a.attnum = s.staattnum
  LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
-  WHERE ((NOT a.attisdropped) AND has_column_privilege(c.oid, a.attnum, 'select'::text));
+  WHERE ((NOT a.attisdropped) AND has_column_privilege(c.oid, a.attnum, 'select'::text) AND ((c.relrowsecurity = false) OR (c.relowner = (("current_user"())::regrole)::oid)));
 pg_tables| SELECT n.nspname AS schemaname,
 c.relname AS tablename,
 pg_get_userbyid(c.relowner) AS tableowner,
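
A quick sanity check of the intended behavior (a hypothetical session;
table and role names invented):

CREATE TABLE secrets (v text);
INSERT INTO secrets VALUES ('a'), ('b');
ALTER TABLE secrets ENABLE ROW LEVEL SECURITY;
ANALYZE secrets;
SET ROLE alice;  -- an existing role that does not own the table
SELECT tablename, attname FROM pg_stats WHERE tablename = 'secrets';
-- with the patch: zero rows; without it, the stats leak through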



Re: [HACKERS] creating extension including dependencies

2015-07-20 Thread Michael Paquier
On Mon, Jul 20, 2015 at 10:20 PM, Petr Jelinek  wrote:
> On 2015-07-19 17:16, Michael Paquier wrote:
>>
>> On Sat, Jul 18, 2015 at 8:00 AM, Petr Jelinek 
>> wrote:
>>>
>>> On 2015-07-15 06:07, Michael Paquier wrote:


 On Fri, Jul 10, 2015 at 11:28 PM, Tom Lane  wrote:
>
>
> Andres Freund  writes:
>>
>>
>> On July 10, 2015 4:16:59 PM GMT+02:00, Tom Lane 
>>
>> wrote:
>>>
>>>
>>> Would that propagate down through multiple levels of CASCADE?
>>> (Although
>>> I'm not sure it would be sensible for a non-relocatable extension to
>>> depend on a relocatable one, so maybe the need doesn't arise in
>>> practice.)
>
>
>
>> I'd day so. I was thinking it'd adding a flag that allows to pass a
>> schema to a non relocatable extension. That'd then be passed down. I
>> agree that it's unlikely to be used often.
>
>
>
> Yeah, I was visualizing it slightly differently, as a separate
> internal-only option "schema_if_needed", but it works out to the
> same thing either way.
>>>
>>>
>>>
>>> I added a DEFAULT SCHEMA option into CREATE EXTENSION which behaves like
>>> SCHEMA but only for extensions that don't specify their schema, and is
>>> mutually exclusive with plain SCHEMA. The DEFAULT SCHEMA propagates when
>>> CASCADE is used, while the SCHEMA option does not propagate. I'd like to
>>> hear opinions about this behavior. It would be possible to propagate
>>> SCHEMA as DEFAULT SCHEMA, but that might have surprising results
>>> (installing dependencies in the same schema as the current ext).
>>
>>
>> Hm...
>>
>> First, the current patch has a surprising behavior because it fails to
>> create an extension in cascade when creation is attempted on a custom
>> schema:
>> =# create schema po;
>> CREATE SCHEMA
>> =# create extension hstore_plperl with schema po cascade;
>> NOTICE:  0: installing required extension "hstore"
>> LOCATION:  CreateExtension, extension.c:1483
>> NOTICE:  0: installing required extension "plperl"
>> LOCATION:  CreateExtension, extension.c:1483
>> ERROR:  42704: type "hstore" does not exist
>> LOCATION:  typenameType, parse_type.c:258
>> Time: 30.071 ms
>> To make life easier for users, I think that parent extensions should
>> be created in the same schema as their child by default. In this case
>> hstore should be created in po, not public.
>>
>
> That's actually a bug because the previous version of the patch did not set
> the reqext correctly after creating the required extension.
>
>>
>> Hence, as a schema can only be specified in a control file for a
>> non-relocatable extension, I think that the schema name propagated to
>> the root extensions should be the one specified in the control file of
>> the child, if it is defined there, instead of the one specified by the
>> user (imagine for example an extension created in cascade that requires
>> adminpack, an extension that can only be deployed in pg_catalog), and
>> have the root node use its own schema, if it has one in its control
>> file, by default.
>>
>> For example in this case:
>> foo1 (WITH SCHEMA hoge) -> foo2 (schema = pg_catalog) -> foo2_1
>>                                                       |--> foo2_2
>>      ---> foo3 (no schema)
>> With this command:
>> CREATE EXTENSION foo1 WITH SCHEMA hoge;
>> foo3 is created in schema hoge. foo2, as well as its required
>> dependencies foo2_1 and foo2_2, are created in pg_catalog.
>>
>> Now DEFAULT SCHEMA is trying to achieve: "Hey, I want to define foo2_1
>> and foo2_2 in schema hoge". This may be worth achieving (though IMO
>> foo1 should have direct dependencies on foo2_1 and foo2_2), but I
>> think that we should decide what default behavior we want, and
>> implement the additional behavior in a second patch, separate from the
>> patch of this thread. Personally I am more a fan of propagating to
>> parent extensions the schema of the child and not of its grandchild by
>> default, but I am not alone here :)
>>
>
> This is something that I, as a user of the feature, specifically don't
> want to happen, as I install extensions either into a common schema (for
> some utility ones) or into separate schemas (for the rest), but I never
> want a utility extension to go into the same schema as the other ones.
> This is especially true when installing a non-relocatable extension that
> depends on a relocatable one. Obviously, that does not mean that nobody
> else wants this though :)

So, in your patch the default behavior is to create parent extensions
in schema public if the child extension specifies a schema:
=# create schema po;
CREATE SCHEMA
=# create extension test_ext2 cascade schema po;
CREATE EXTENSION
=# \dx  test_ext*
        List of installed extensions
   Name    | Version | Schema | Description
-----------+---------+--------+-------------
 test_ext2 | 1.0     | po     |

Re: [HACKERS] Unnecessary #include in objectaddress.h?

2015-07-20 Thread Adam Brightwell
> I wondered whether to bother about this kind of thing for a while.  It
> doesn't have any practical impact immediately, because obviously
> pg_list.h is still included indirectly by objectaddress.h (via lock.h in
> this case IIRC).  If we made some restructuring that caused the other
> header not to include pg_list.h anymore, that would make objectaddress.h
> broken -- unless objectaddress.h itself no longer needed pg_list.h.
>
> In previous rounds we've had whole iterations on a "pgrminclude" script
> that does this kind of thing, but the breakage after each such run is
> large.
>
> All in all, I wouldn't bother unless there is an actual change.

Understood.  Thanks.

-Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com




Re: [HACKERS] security labels on databases are bad for dump & restore

2015-07-20 Thread Adam Brightwell
> Consistency with existing practice would indeed have pg_dump ignore
> pg_shseclabel and have pg_dumpall reproduce its entries.

I think that makes sense, but what about other DATABASE level info
such as COMMENT?  Should that be ignored by pg_dump as well?  I'm
specifically thinking of the discussion from the following thread:

http://www.postgresql.org/message-id/20150317172459.gm3...@alvh.no-ip.org

If COMMENT is included then why not SECURITY LABEL or others?

> In a greenfield, I would make "pg_dump --create" reproduce pertinent entries
> from datacl, pg_db_role_setting, pg_shseclabel and pg_shdescription.  I would
> make non-creating pg_dump reproduce none of those.

I think the bigger question is "Where is the line drawn between
pg_dump and pg_dumpall?".  At what point does one tool become the
other?

-Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com




Re: [HACKERS] Unnecessary #include in objectaddress.h?

2015-07-20 Thread Alvaro Herrera
Adam Brightwell wrote:
> All,
> 
> While looking at the include dependency graph for objectaddress.h:
> 
> http://doxygen.postgresql.org/objectaddress_8h.html
> 
> I saw that pg_list.h is both included and inherited (through multiple
> paths) by objectaddress.h.  Perhaps this doesn't matter, but I thought
> I would at least bring it up and propose removing this redundant
> #include from objectaddress.h.

I wondered whether to bother about this kind of thing for a while.  It
doesn't have any practical impact immediately, because obviously
pg_list.h is still included indirectly by objectaddress.h (via lock.h in
this case IIRC).  If we made some restructuring that caused the other
header not to include pg_list.h anymore, that would make objectaddress.h
broken -- unless objectaddress.h itself no longer needed pg_list.h.

In previous rounds we've had whole iterations on a "pgrminclude" script
that does this kind of thing, but the breakage after each such run is
large.

All in all, I wouldn't bother unless there is an actual change.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] "make check" changes have caused buildfarm deterioration.

2015-07-20 Thread Tom Lane
Andrew Dunstan  writes:
> Somewhere along the way some changes to the way we do "make check" have 
> caused a significant deterioration in the buildfarm's logging. Compare 
> these two from animal crake, which happens to be my test instance: 
> 
>  
> and 
> 
>  

Yeah, I've been bitching about the poor logging for awhile, but I had
not realized that it's still working as-expected in the back branches.

Comparing different branches, it looks like "somewhere along the way"
means "between 9.4 and 9.5".  I suspect that commit dcae5faccab64776, or
perhaps the followon dbf2ec1a1c053379, is to blame.

regards, tom lane




[HACKERS] Unnecessary #include in objectaddress.h?

2015-07-20 Thread Adam Brightwell
All,

While looking at the include dependency graph for objectaddress.h:

http://doxygen.postgresql.org/objectaddress_8h.html

I saw that pg_list.h is both included and inherited (through multiple
paths) by objectaddress.h.  Perhaps this doesn't matter, but I thought
I would at least bring it up and propose removing this redundant
#include from objectaddress.h.

If it makes sense to do so, I have attached a patch that removes it.

Thanks,
Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com
diff --git a/src/include/catalog/objectaddress.h b/src/include/catalog/objectaddress.h
new file mode 100644
index 37808c0..432cbe8
*** a/src/include/catalog/objectaddress.h
--- b/src/include/catalog/objectaddress.h
***
*** 13,19 
  #ifndef OBJECTADDRESS_H
  #define OBJECTADDRESS_H
  
- #include "nodes/pg_list.h"
  #include "storage/lock.h"
  #include "utils/acl.h"
  #include "utils/relcache.h"
--- 13,18 



[HACKERS] "make check" changes have caused buildfarm deterioration.

2015-07-20 Thread Andrew Dunstan
Somewhere along the way some changes to the way we do "make check" have 
caused a significant deterioration in the buildfarm's logging. Compare 
these two from animal crake, which happens to be my test instance: 
 
and 
 



In the first place, we now have all that extraneous installation logging 
at the top, and the stuff we are most likely to be interested in right 
at the bottom. But more importantly, we are now missing the initdb log 
and the postmaster log, and the function to get a stack trace if 
required is almost certainly going to be broken as well.


What's most annoying is that some of the stuff for this hasn't just
moved; some is apparently now cleaned up and removed, so the buildfarm
script can't get at it at all.


I'm going to look at what can be done to repair the damage. No doubt I 
should have been paying more attention, but sometimes I just can't keep 
track of everything going on. Fixing this will almost certainly involve 
some core changes, though.


Incidentally, this has real consequences: I just went looking to find 
the postmaster log of a case that had an error and it was missing - 
that's how I noticed this.


cheers

andrew




Re: [HACKERS] All-zero page in GIN index causes assertion failure

2015-07-20 Thread Heikki Linnakangas

On 07/20/2015 07:11 PM, Alvaro Herrera wrote:

Heikki Linnakangas wrote:


Looking closer, heap vacuum does a similar thing, but there are two
mitigating factors that make it safe in practice, I think:

1. An empty heap page is all-zeroes except for the small page header in the
beginning of the page. For a torn page to matter, the page would need to be
torn in the header, but we rely elsewhere (control file) anyway that a
512-byte sector update is atomic, so that shouldn't happen. Note that this
hinges on the fact that there is no special area on heap pages, so you
cannot rely on this for index pages.

2. The redo of the first insert/update on a heap page will always
re-initialize the page, even when full-page-writes are disabled. This is the
XLOG_HEAP_INIT_PAGE optimization. So it's not just an optimization, it's
required for correctness.


Hmm, I guess this is a case of an optimization hiding a bug :-(  I mean,
if we didn't have that, we would have noticed this a lot sooner, I
imagine.


Yeah, probably.

I came up with the attached, for GIN and SP-GiST. Instead of WAL-logging
the page initialization in SP-GiST vacuum, I changed it so that it
simply leaves the page as all-zeroes, and adds it to the FSM. The code
that grabs a target page from the FSM handles that, and initializes the
page anyway, so that was simpler. This made the SP-GiST is-deleted flag
obsolete; it's no longer set, although the code still checks for it for
backwards-compatibility. (Even that may actually be unnecessary, as a
page that's marked as deleted must also be empty, and wherever we check
for the deleted flag, we also check for PageIsEmpty().)


- Heikki

diff --git a/src/backend/access/gin/ginvacuum.c b/src/backend/access/gin/ginvacuum.c
index eba572b..1315762 100644
--- a/src/backend/access/gin/ginvacuum.c
+++ b/src/backend/access/gin/ginvacuum.c
@@ -710,7 +710,7 @@ ginvacuumcleanup(PG_FUNCTION_ARGS)
 		LockBuffer(buffer, GIN_SHARE);
 		page = (Page) BufferGetPage(buffer);
 
-		if (GinPageIsDeleted(page))
+		if (PageIsNew(page) || GinPageIsDeleted(page))
 		{
 			Assert(blkno != GIN_ROOT_BLKNO);
 			RecordFreeIndexPage(index, blkno);
diff --git a/src/backend/access/spgist/spgvacuum.c b/src/backend/access/spgist/spgvacuum.c
index dc69d1e..36e93ad 100644
--- a/src/backend/access/spgist/spgvacuum.c
+++ b/src/backend/access/spgist/spgvacuum.c
@@ -621,14 +621,10 @@ spgvacuumpage(spgBulkDeleteState *bds, BlockNumber blkno)
 	{
 		/*
 		 * We found an all-zero page, which could happen if the database
-		 * crashed just after extending the file.  Initialize and recycle it.
+		 * crashed just after extending the file.  Recycle it.
 		 */
-		SpGistInitBuffer(buffer, 0);
-		SpGistPageSetDeleted(page);
-		/* We don't bother to WAL-log this action; easy to redo */
-		MarkBufferDirty(buffer);
 	}
-	else if (SpGistPageIsDeleted(page))
+	else if (PageIsEmpty(page))
 	{
 		/* nothing to do */
 	}
@@ -659,25 +655,18 @@ spgvacuumpage(spgBulkDeleteState *bds, BlockNumber blkno)
 	 */
 	if (!SpGistBlockIsRoot(blkno))
 	{
-		/* If page is now empty, mark it deleted */
-		if (PageIsEmpty(page) && !SpGistPageIsDeleted(page))
-		{
-			SpGistPageSetDeleted(page);
-			/* We don't bother to WAL-log this action; easy to redo */
-			MarkBufferDirty(buffer);
-		}
-
-		if (SpGistPageIsDeleted(page))
+		if (PageIsEmpty(page))
 		{
 			RecordFreeIndexPage(index, blkno);
 			bds->stats->pages_deleted++;
 		}
 		else
+		{
+			SpGistSetLastUsedPage(index, buffer);
 			bds->lastFilledBlock = blkno;
+		}
 	}
 
-	SpGistSetLastUsedPage(index, buffer);
-
 	UnlockReleaseBuffer(buffer);
 }
 
diff --git a/src/include/access/spgist_private.h b/src/include/access/spgist_private.h
index 413f71e..48dadd5 100644
--- a/src/include/access/spgist_private.h
+++ b/src/include/access/spgist_private.h
@@ -48,14 +48,14 @@ typedef SpGistPageOpaqueData *SpGistPageOpaque;
 
 /* Flag bits in page special space */
 #define SPGIST_META			(1<<0)
-#define SPGIST_DELETED		(1<<1)
+#define SPGIST_DELETED		(1<<1)		/* never set, but keep for backwards
+		 * compatibility */
 #define SPGIST_LEAF			(1<<2)
 #define SPGIST_NULLS		(1<<3)
 
 #define SpGistPageGetOpaque(page) ((SpGistPageOpaque) PageGetSpecialPointer(page))
 #define SpGistPageIsMeta(page) (SpGistPageGetOpaque(page)->flags & SPGIST_META)
 #define SpGistPageIsDeleted(page) (SpGistPageGetOpaque(page)->flags & SPGIST_DELETED)
-#define SpGistPageSetDeleted(page) (SpGistPageGetOpaque(page)->flags |= SPGIST_DELETED)
 #define SpGistPageIsLeaf(page) (SpGistPageGetOpaque(page)->flags & SPGIST_LEAF)
 #define SpGistPageStoresNulls(page) (SpGistPageGetOpaque(page)->flags & SPGIST_NULLS)
 



Re: [HACKERS] First Aggregate Funtion?

2015-07-20 Thread Merlin Moncure
On Mon, Jul 20, 2015 at 10:06 AM, Paul A Jungwirth
 wrote:
>> The above implementation of "first" aggregate returns the first non-NULL item
>> value.
>
> I'm curious what advantages this approach has over these FIRST/LAST
> functions from the Wiki?:
>
> https://wiki.postgresql.org/wiki/First/last_%28aggregate%29
>
> Also to get the "first non-null value" you can apply an ordering to
> just the aggregate function, e.g.:
>
> select first(id order by start_time nulls last) from events;
>
> If you want speed you should probably write a C version.

C functions come with a lot of administration headaches, and the
performance gain will probably not be significant unless you totally
bypass the SPI interface.  Even then, I suspect (vs the pl/pgsql
variant which caches the plan) the majority of overhead is in calling
the function, not the actual implementation.  It'd be interesting to
see the results though.

merlin




Re: [BUGS] [HACKERS] object_classes array is broken, again

2015-07-20 Thread Tom Lane
I wrote:
> +1 to this patch, in fact I think we could remove MAX_OCLASS altogether
> which would be very nice for switch purposes.

Oh, wait, I forgot that the patch itself introduces another reference to
MAX_OCLASS.  I wonder though if we should get rid of that as an enum value
in favor of #define'ing a LAST_OCLASS macro referencing the last enum
item, or some other trick like that.  It's certainly inconvenient in
event_trigger.c to have a phony member of the enum.

regards, tom lane




Re: [BUGS] [HACKERS] object_classes array is broken, again

2015-07-20 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> What about adding StaticAsserts that lengthof() the relevant constant
>> arrays is equal to MAX_OCLASS?  (Or other similar ways of checking
>> that they have the right number of entries.)

> Well, the array itself is declared like this:
>   static const Oid object_classes[MAX_OCLASS] = {
> so testing lengthof() of it is useless because it's a constant and the
> assertion always holds.  If it were declared like this instead:
>   static const Oid object_classes[] = {
> then we could use lengthof().

Ah.  I think the point of using MAX_OCLASS there was to get a warning
if the array was too short, but evidently it doesn't work like that.

> I don't see any drawbacks to that.

+1 to this patch, in fact I think we could remove MAX_OCLASS altogether
which would be very nice for switch purposes.

Are there any other arrays that need such tests?

regards, tom lane




Re: [HACKERS] Dead code in Create/RenameRole() after RoleSpec changes related to CURRENT/SESSION_USER

2015-07-20 Thread Alvaro Herrera
Jeevan Chalke wrote:

> I found some dead code in CREATE/RENAME ROLE code path.
> 
> We have introduced RoleSpec and handled public and none role names in
> grammar
> itself. We do have these handling in CreateRole() and RenameRole()
> which is NO more valid now.

Right.


> Attached patch to remove those.

Pushed.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Typo in comment in setrefs.c

2015-07-20 Thread Alvaro Herrera
Etsuro Fujita wrote:
> I ran into a typo in a comment in setrefs.c.  Patch attached.

Fixed by Heikki in 7845db2aa.




-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [BUGS] [HACKERS] object_classes array is broken, again

2015-07-20 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:
> > Any opinions on this idea?  I don't like it all that much, but it's
> > plenty effective.
> 
> I don't like it much either.
> 
> What about adding StaticAsserts that lengthof() the relevant constant
> arrays is equal to MAX_OCLASS?  (Or other similar ways of checking
> that they have the right number of entries.)

Well, the array itself is declared like this:
static const Oid object_classes[MAX_OCLASS] = {
so testing lengthof() of it is useless because it's a constant and the
assertion always holds.  If it were declared like this instead:
static const Oid object_classes[] = {
then we could use lengthof().

I don't see any drawbacks to that.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index c1212e9..de2e2f2 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -126,7 +126,7 @@ typedef struct
  * This constant table maps ObjectClasses to the corresponding catalog OIDs.
  * See also getObjectClass().
  */
-static const Oid object_classes[MAX_OCLASS] = {
+static const Oid object_classes[] = {
 	RelationRelationId,			/* OCLASS_CLASS */
 	ProcedureRelationId,		/* OCLASS_PROC */
 	TypeRelationId,/* OCLASS_TYPE */
@@ -158,7 +158,8 @@ static const Oid object_classes[MAX_OCLASS] = {
 	DefaultAclRelationId,		/* OCLASS_DEFACL */
 	ExtensionRelationId,		/* OCLASS_EXTENSION */
 	EventTriggerRelationId,		/* OCLASS_EVENT_TRIGGER */
-	PolicyRelationId			/* OCLASS_POLICY */
+	PolicyRelationId,			/* OCLASS_POLICY */
+	TransformRelationId			/* OCLASS_TRANSFORM */
 };
 
 
@@ -2037,6 +2038,12 @@ add_object_address(ObjectClass oclass, Oid objectId, int32 subId,
 {
 	ObjectAddress *item;
 
+	/*
+	 * Make sure object_classes is kept up to date with the ObjectClass enum.
+	 */
+	StaticAssertStmt((lengthof(object_classes) == MAX_OCLASS),
+	 "object_classes[] must cover all ObjectClasses");
+
 	/* enlarge array if needed */
 	if (addrs->numrefs >= addrs->maxrefs)
 	{
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 5da18c2..138788e 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -112,7 +112,7 @@ typedef struct ObjectAddresses ObjectAddresses;
 
 /*
  * This enum covers all system catalogs whose OIDs can appear in
- * pg_depend.classId or pg_shdepend.classId.
+ * pg_depend.classId or pg_shdepend.classId.  See object_classes[].
  */
 typedef enum ObjectClass
 {



Re: [HACKERS] All-zero page in GIN index causes assertion failure

2015-07-20 Thread Alvaro Herrera
Heikki Linnakangas wrote:

> Looking closer, heap vacuum does a similar thing, but there are two
> mitigating factors that make it safe in practice, I think:
> 
> 1. An empty heap page is all-zeroes except for the small page header in the
> beginning of the page. For a torn page to matter, the page would need to be
> torn in the header, but we rely elsewhere (control file) anyway that a
> 512-byte sector update is atomic, so that shouldn't happen. Note that this
> hinges on the fact that there is no special area on heap pages, so you
> cannot rely on this for index pages.
> 
> 2. The redo of the first insert/update on a heap page will always
> re-initialize the page, even when full-page-writes are disabled. This is the
> XLOG_HEAP_INIT_PAGE optimization. So it's not just an optimization, it's
> required for correctness.

Hmm, I guess this is a case of an optimization hiding a bug :-(  I mean,
if we didn't have that, we would have noticed this a lot sooner, I
imagine.


> BRIN update is not quite right, however. brin_getinsertbuffer() can
> initialize a page, but the caller might bail out without using the page and
> WAL-logging the change. If that happens, the next update that uses the same
> page will WAL-log the change but it will not use the XLOG_BRIN_INIT_PAGE
> option, and redo will not initialize the page. Redo will fail.

Hmm, interesting.

> BTW, shouldn't there be a step in BRIN vacuum that scans all the BRIN pages?
> If an empty page is missing from the FSM for any reason, there is nothing
> that would add it there.

Probably, yeah, makes sense.  If you recall, the original design I
proposed had a scan of the whole index (though for other reasons I am
thankful we got rid of that).

> This is all very subtle.

No kidding ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] First Aggregate Funtion?

2015-07-20 Thread Marko Tiikkaja

On 7/20/15 6:02 PM, Corey Huinker wrote:

By using only(a.name_of_the_thing) we'd have a bit more clarity that the
author expected all of those values to be the same across the aggregate
window, and discovering otherwise was reason enough to fail the query.

*IF* we're considering adding these to core, I think that only() would be
just a slight modification of the last() implementation, and could be done
at the same time.

[1] I don't care what it gets named. I just want the functionality.


A big +1 from me.  In fact, I wrote a patch implementing this for 9.5 
but never got around to finishing it.



.m




Re: [HACKERS] First Aggregate Funtion?

2015-07-20 Thread Corey Huinker
On Mon, Jul 20, 2015 at 11:06 AM, Paul A Jungwirth <p...@illuminatedcomputing.com> wrote:

> > The above implementation of "first" aggregate returns the first non-NULL
> item
> > value.
>
> I'm curious what advantages this approach has over these FIRST/LAST
> functions from the Wiki?:
>
> https://wiki.postgresql.org/wiki/First/last_%28aggregate%29
>
> Also to get the "first non-null value" you can apply an ordering to
> just the aggregate function, e.g.:
>
> select first(id order by start_time nulls last) from events;
>
> If you want speed you should probably write a C version.
>
> Is there something I'm missing?
>
> Also since we're on the hackers list is this a proposal to add these
> functions to core Postgres?
>
> Yours,
> Paul
>
>


If it is a proposal to add to core, I'd like to suggest a close cousin
function of first()/last():  only(). [1]

It would behave like first() but would throw an error if it encountered
more than one distinct value in the window.

This would be helpful in dependent grouping situations like this:
select a.keyval, a.name_of_the_thing, sum(b.metric_value) as metric_value
from a
join b on b.a_keyval = a.keyval
group by a.keyval, a.name_of_the_thing

Now, everyone's made this optimization to reduce group-by overhead:
select a.keyval, min(a.name_of_the_thing) as name_of_the_thing,
sum(b.metric_value) as metric_value
from a
join b on b.a_keyval = a.keyval
group by a.keyval

Which works fine, but it's self-anti-documenting:
- it implies that the name of the thing *could* be different across rows
  with the same keyval
- it implies we have some business preference for names that come first
  in alphabetical order
- it implies that the string has more in common with the summed metrics
  (imagine this query has dozens of them) than with the key values to the
  left.

Using first(a.name_of_the_thing) is less overhead than min()/max(), but has
the same issues listed above.

By using only(a.name_of_the_thing) we'd have a bit more clarity that the
author expected all of those values to be the same across the aggregate
window, and discovering otherwise was reason enough to fail the query.

*IF* we're considering adding these to core, I think that only() would be
just a slight modification of the last() implementation, and could be done
at the same time.

[1] I don't care what it gets named. I just want the functionality.
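
A minimal user-level sketch of such an aggregate (named only_value here
since ONLY is a reserved word; NULL handling deliberately simplified,
and none of this exists in core today):

CREATE FUNCTION only_value_sfunc(anyelement, anyelement)
RETURNS anyelement LANGUAGE plpgsql IMMUTABLE AS $$
BEGIN
    -- fail as soon as a second distinct value shows up
    IF $1 IS NOT NULL AND $1 IS DISTINCT FROM $2 THEN
        RAISE EXCEPTION 'only_value() saw distinct values: % vs %', $1, $2;
    END IF;
    RETURN $2;
END;
$$;

CREATE AGGREGATE only_value(anyelement) (
    SFUNC = only_value_sfunc,
    STYPE = anyelement
);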


Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape

2015-07-20 Thread Emre Hasegeli
>> to handle DROP dependency behaviors properly.  (On reflection, maybe
>> better if it's "bernoulli(internal) returns tablesample_handler",
>> so as to guarantee that such a function doesn't create a conflict with
>> any user-defined function of the same name.)
>
> The probability of conflict seems high with the system() so yeah we'd need
> some kind of differentiator.

Maybe it would be even better to have something like bernoulli(tablesample),
where "tablesample" is defined as a pseudo-type like "trigger".




Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape

2015-07-20 Thread Tom Lane
Petr Jelinek  writes:
> On 2015-07-19 22:56, Tom Lane wrote:
>> * I specified that omitting NextSampleBlock is allowed and causes the
>> core code to do a standard seqscan, including syncscan support, which
>> is a behavior that's impossible with the current API.  If we fix
>> the bernoulli code to have history-independent sampling behavior,
>> I see no reason that syncscan shouldn't be enabled for it.

> Umm, we were actually doing syncscan as well before.

Doesn't look like it to me: heap_beginscan_sampling always passes
allow_sync = false to heap_beginscan_internal.

More to the point, the existing coding of all these methods is such
that they would fail to be reproducible if syncscan were enabled,
because the scan start point wouldn't be the same.  That's fixable,
but it'll take more work than just dealing with the above oversight.
In any case, given that Simon has stated he wants support for sample
methods that don't try to be reproducible, we need the method
to be able to say whether syncscan is allowed.

regards, tom lane




Re: [HACKERS] patch: enhanced DROP POLICY tab complete

2015-07-20 Thread Pavel Stehule
Thank you very much

Pavel

2015-07-20 15:39 GMT+02:00 Alvaro Herrera :

> Pavel Stehule wrote:
> > Hi
> >
> > I am sending trivial patch, that enforce more precious tab complete for
> > DROP POLICY statement
>
> Thanks, pushed and preciousified.
>
> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] First Aggregate Funtion?

2015-07-20 Thread Paul A Jungwirth
> The above implementation of "first" aggregate returns the first non-NULL item
> value.

I'm curious what advantages this approach has over these FIRST/LAST
functions from the Wiki?:

https://wiki.postgresql.org/wiki/First/last_%28aggregate%29
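
For reference, the wiki's first() is essentially a trivial SQL
transition function wrapped in an aggregate, roughly:

CREATE OR REPLACE FUNCTION public.first_agg(anyelement, anyelement)
RETURNS anyelement LANGUAGE sql IMMUTABLE STRICT AS $$
    SELECT $1;
$$;

CREATE AGGREGATE public.first (anyelement) (
    SFUNC = public.first_agg,
    STYPE = anyelement
);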

Also to get the "first non-null value" you can apply an ordering to
just the aggregate function, e.g.:

select first(id order by start_time nulls last) from events;

If you want speed you should probably write a C version.

Is there something I'm missing?

Also since we're on the hackers list is this a proposal to add these
functions to core Postgres?

Yours,
Paul




Re: [HACKERS] First Aggregate Funtion?

2015-07-20 Thread Merlin Moncure
On Mon, Jul 20, 2015 at 8:40 AM, sudalai  wrote:
>
>>I don't think so, because arrays can contain duplicates.
>
> I just add two elements to the array: one for the INITCOND value NULL, and
> a second for the first row value.
> So the array size is always 2, so no duplicates.
>
>>rhaas=# select coalesce(first(x.column1), 'wrong') from (values
>>(null), ('correct')) x;
>  >coalesce
>>--
>> wrong
>>(1 row)
> It works correctly.
> I didn't say it returns the first non-NULL value for a column from the
> aggregate window.
> I said my implementation returns the first row's value for a column.
> Here the first row's element is NULL, hence it returns NULL.
>
>
> check this
> db=# select
> db-# coalesce(first(x.column1),'null') as col1 ,
> db-# coalesce(first(x.column2),'null') as col2,
> db-#  coalesce(first(x.column3),'null') as col3
> db-# from (values (null,'abc',null), ('correct','wrong','notsure'),
> ('second','second1','second3')) x
> db-# ;
>  col1 | col2 | col3
> --+--+--
>  null | abc  | null
> (1 row)
>
> It works correctly. It returns the first row's value for a column.

I was able to get a ~45% runtime reduction by simply converting
"two_value_holder" from sql to plpgsql.   SQL functions (unlike
pl/pgsql) are parsed and planned every time they are run unless they
are inlined.  Our aggregation API unfortunately is a hard fence
against inlining; solving this is a major optimization target IMO.
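
For the archives, the plpgsql version would look something like this
(the original SQL two_value_holder isn't shown in this thread, so the
signature and state shape here are assumptions based on the
description upthread: a two-slot array holding the INITCOND NULL plus
the first row's value):

CREATE OR REPLACE FUNCTION two_value_holder(state text[], val text)
RETURNS text[] LANGUAGE plpgsql IMMUTABLE AS $$
BEGIN
    IF array_length(state, 1) < 2 THEN
        RETURN state || val;   -- capture the first row's value
    END IF;
    RETURN state;              -- the state is frozen afterwards
END;
$$;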

merlin




Re: [HACKERS] pg_trgm version 1.2

2015-07-20 Thread Alexander Korotkov
On Wed, Jul 15, 2015 at 12:31 AM, Jeff Janes  wrote:

> On Tue, Jul 7, 2015 at 6:33 AM, Alexander Korotkov <
> a.korot...@postgrespro.ru> wrote:
>
>>
>>
>> See Tom Lane's comment about downgrade scripts. I think just removing it
>> is the right solution.
>>
>
> The new patch removes the downgrade path and the ability to install the
> old version.
>
> (If anyone wants an easy downgrade path for testing, they can keep using
> the prior patch--no functional changes)
>
> It also added a comment before the trigramsMatchGraph call.
>
> I retained the palloc and the loop to promote the ternary array to a
> binary array.  While I also think it is tempting to get rid of that by
> abusing the type system and would do it that way in my own standalone code,
> it seems contrary to way the project usually does things.  And I couldn't
> measure a difference in performance.
>

Ok!


> 
>
>
>
>> Let's consider '^(?!.*def).*abc' regular expression as an example. It
>> matches strings which contains 'abc' and don't contains 'def'.
>>
>> # SELECT 'abc' ~ '^(?!.*def).*abc';
>>  ?column?
>> --
>>  t
>> (1 row)
>>
>> # SELECT 'def abc' ~ '^(?!.*def).*abc';
>>  ?column?
>> --
>>  f
>> (1 row)
>>
>> # SELECT 'abc def' ~ '^(?!.*def).*abc';
>>  ?column?
>> --
>>  f
>> (1 row)
>>
>> Theoretically, our trigram regex processing could support negative
>> matching of 'def' trigram, i.e. trigramsMatchGraph(abc = true, def = false)
>> = true but trigramsMatchGraph(abc = true, def = true) = false. Actually, it
>> doesn't because trigramsMatchGraph() implements a monotonic function. I
>> just think it should be stated explicitly.
>>
>
> Do you think it is likely to change to stop being monotonic and so support
> the (def=GIN_TRUE) => false case?
>
> ^(?!.*def)   seems like a profoundly niche situation.  (Although one that
> I might actually start using myself now that I know it isn't just a
> Perl-ism).
>
> It doesn't make any difference to this patch, other than perhaps how to
> word the comments.
>

Yes, it was just about the comments.

I also ran a few tests on a real-life dataset: a set of DBLP paper
titles. As expected, there is a huge speedup when the pattern is long
enough.

# SELECT count(*) FROM dblp_titles WHERE s ILIKE '%Multidimensional Data%';
 count
---
   318
(1 row)

Time: 29,114 ms

# SELECT count(*) FROM dblp_titles WHERE s ~* '.*Multidimensional Data.*';
 count
---
   318
(1 row)

Time: 26,273 ms

# SELECT count(*) FROM dblp_titles WHERE s ILIKE '%Multidimensional Data%';
 count
---
   318
(1 row)

Time: 249,725 ms

# SELECT count(*) FROM dblp_titles WHERE s ~* '.*Multidimensional Data.*';
 count
---
   318
(1 row)

Time: 218,627 ms
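
(For context, queries like these are accelerated by a trigram GIN
index; the exact setup behind the timings above isn't shown, but it
would be something along these lines:)

CREATE INDEX dblp_titles_trgm_idx ON dblp_titles USING gin (s gin_trgm_ops);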

For me, the patch is in pretty good shape. I'm going to mark it "Ready
for Committer".

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] All-zero page in GIN index causes assertion failure

2015-07-20 Thread Heikki Linnakangas

On 07/20/2015 11:14 AM, Heikki Linnakangas wrote:

ISTM ginvacuumcleanup should check for PageIsNew, and put the page in
the FSM. That's what btvacuumpage() and gistvacuumcleanup() do.
spgvacuumpage() seems to also check for PageIsNew(), but it seems broken
in a different way: it initializes the page and marks the page as dirty,
but it is not WAL-logged. That is a problem at least if checksums are
enabled: if you crash you might have a torn page on disk, with invalid
checksum.


Looking closer, heap vacuum does a similar thing, but there are two 
mitigating factors that make it safe in practice, I think:


1. An empty heap page is all-zeroes except for the small page header in 
the beginning of the page. For a torn page to matter, the page would 
need to be torn in the header, but we rely elsewhere (control file) 
anyway that a 512-byte sector update is atomic, so that shouldn't 
happen. Note that this hinges on the fact that there is no special area 
on heap pages, so you cannot rely on this for index pages.


2. The redo of the first insert/update on a heap page will always 
re-initialize the page, even when full-page-writes are disabled. This is 
the XLOG_HEAP_INIT_PAGE optimization. So it's not just an optimization, 
it's required for correctness.



Heap update can also leave behind a page in the buffer cache that's been 
initialized by RelationGetBufferForTuple but not yet WAL-logged. 
However, it doesn't mark the buffer dirty, so the torn-page problem 
cannot happen because the page will not be flushed to disk if nothing 
else touches it. The XLOG_HEAP_INIT_PAGE optimization is needed in that 
case too, however.


B-tree, GiST, and SP-GiST's relation extension work similarly, but they 
have other mitigating factors. If a newly-initialized B-tree page is 
left behind in the relation, it won't be reused for anything, and vacuum 
will ignore it (by accident, I think; there is no explicit comment on 
what will happen to such pages, but it will be treated like an internal 
page and ignored). Eventually the buffer will be evicted from cache, and 
because it's not marked as dirty, it will not be flushed to disk, and 
will later be read back as all-zeros and vacuum will recycle it.


BRIN update is not quite right, however. brin_getinsertbuffer() can 
initialize a page, but the caller might bail out without using the page 
and WAL-logging the change. If that happens, the next update that uses 
the same page will WAL-log the change but it will not use the 
XLOG_BRIN_INIT_PAGE option, and redo will not initialize the page. Redo 
will fail.


BTW, shouldn't there be a step in BRIN vacuum that scans all the BRIN
pages? If an empty page is missing from the FSM for any reason, there is
nothing that would add it there.


This is all very subtle. The whole business of leaving behind an 
already-initialized page in the buffer cache, without marking the buffer 
as dirty, is pretty ugly. I wish we had a more robust pattern to handle 
all-zero pages and relation extension.


Thoughts? As a minimal backpatchable fix, I think we should add the 
check in ginvacuumpage() to initialize any all-zeros pages it 
encounters. That needs to be WAL-logged, and WAL-logging needs to be 
added to the page initialization in spgvacuumpage too.


- Heikki





Re: [BUGS] [HACKERS] object_classes array is broken, again

2015-07-20 Thread Tom Lane
Alvaro Herrera  writes:
> Any opinions on this idea?  I don't like it all that much, but it's
> plenty effective.

I don't like it much either.

What about adding StaticAsserts that lengthof() the relevant constant
arrays is equal to MAX_OCLASS?  (Or other similar ways of checking
that they have the right number of entries.)
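
Something like this, using the existing lengthof() and StaticAssertStmt()
macros (just a sketch; it would sit inside some function that touches the
array, e.g. add_object_address):

	/* compile-time check that the array tracks the ObjectClass enum */
	StaticAssertStmt(lengthof(object_classes) == MAX_OCLASS,
					 "object_classes[] must cover all ObjectClass values");

Then forgetting to extend the array fails the build instead of silently
misbehaving at runtime.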

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] First Aggregate Funtion?

2015-07-20 Thread sudalai

>I don't think so, because arrays can contain duplicates.

   I just add two elements to the array: one for the INITCOND value NULL, and
a second for the first row's value. So the array size is always 2, and there
are no duplicates.

>rhaas=# select coalesce(first(x.column1), 'wrong') from (values
>(null), ('correct')) x;
 >coalesce
>--
> wrong
>(1 row)
It works correctly.
I didn't say it returns the first non-null value for a column from the
aggregate window.
I said my implementation returns the first row's value for a column.
Here the first row's element is null, hence it returns null.


check this
db=# select 
db-# coalesce(first(x.column1),'null') as col1 , 
db-# coalesce(first(x.column2),'null') as col2,
db-#  coalesce(first(x.column3),'null') as col3 
db-# from (values (null,'abc',null), ('correct','wrong','notsure'),
('second','second1','second3')) x 
db-# ;
 col1 | col2 | col3 
--+--+--
 null | abc  | null
(1 row)

It works correctly: it returns the first row's value for each column.
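
For reference, a SQL-level sketch of the array-based approach described
above (function and aggregate names here are made up for illustration;
this is not the exact posted code):

CREATE FUNCTION first_sfunc(state anyarray, val anyelement)
RETURNS anyarray LANGUAGE sql IMMUTABLE AS
$$ SELECT CASE WHEN array_length($1, 1) = 2
               THEN $1            -- first row already captured, keep it
               ELSE $1 || $2      -- append first row's value (may be NULL)
          END $$;

CREATE FUNCTION first_ffunc(state anyarray)
RETURNS anyelement LANGUAGE sql IMMUTABLE AS
$$ SELECT $1[2] $$;               -- the second slot holds the answer

CREATE AGGREGATE first(anyelement) (
    sfunc     = first_sfunc,
    stype     = anyarray,
    finalfunc = first_ffunc,
    initcond  = '{NULL}'          -- one element to start; size 2 once set
);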

--Sudalai



-
sudalai
--
View this message in context: 
http://postgresql.nabble.com/First-Aggregate-Funtion-tp1943031p5858584.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: enhanced DROP POLICY tab complete

2015-07-20 Thread Alvaro Herrera
Pavel Stehule wrote:
> Hi
> 
> I am sending a trivial patch that enforces more precious tab completion
> for the DROP POLICY statement

Thanks, pushed and preciousified.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] creating extension including dependencies

2015-07-20 Thread Petr Jelinek

On 2015-07-19 17:16, Michael Paquier wrote:

On Sat, Jul 18, 2015 at 8:00 AM, Petr Jelinek  wrote:

On 2015-07-15 06:07, Michael Paquier wrote:


On Fri, Jul 10, 2015 at 11:28 PM, Tom Lane  wrote:


Andres Freund  writes:


On July 10, 2015 4:16:59 PM GMT+02:00, Tom Lane 
wrote:


Would that propagate down through multiple levels of CASCADE?
(Although
I'm not sure it would be sensible for a non-relocatable extension to
depend on a relocatable one, so maybe the need doesn't arise in
practice.)




I'd say so. I was thinking of adding a flag that allows passing a
schema to a non-relocatable extension. That'd then be passed down. I
agree that it's unlikely to be used often.



Yeah, I was visualizing it slightly differently, as a separate
internal-only option "schema_if_needed", but it works out to the
same thing either way.



I added a DEFAULT SCHEMA option to CREATE EXTENSION which behaves like
SCHEMA but applies only to extensions that don't specify their schema, and
is mutually exclusive with plain SCHEMA. DEFAULT SCHEMA propagates when
CASCADE is used, while the SCHEMA option does not propagate. I'd like to hear
opinions about this behavior. It would be possible to propagate SCHEMA as
DEFAULT SCHEMA, but that might have surprising results (installing
dependencies in the same schema as the current extension).
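
For illustration, usage could look like this (sketch only; the keyword
placement follows the description above, not any committed grammar):

CREATE EXTENSION hstore_plperl CASCADE DEFAULT SCHEMA util;
-- hstore and plperl, which don't pin a schema in their control files,
-- are installed into "util"; an extension whose control file sets
-- schema = pg_catalog would still go to pg_catalog.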


Hm...

First, the current patch has a surprising behavior because it fails to
create an extension in cascade when creation is attempted on a custom
schema:
=# create schema po;
CREATE SCHEMA
=# create extension hstore_plperl with schema po cascade;
NOTICE:  0: installing required extension "hstore"
LOCATION:  CreateExtension, extension.c:1483
NOTICE:  0: installing required extension "plperl"
LOCATION:  CreateExtension, extension.c:1483
ERROR:  42704: type "hstore" does not exist
LOCATION:  typenameType, parse_type.c:258
Time: 30.071 ms
To make users' lives easier, I think that parent extensions
should be created in the same schema as their child by default. In this
case hstore should be created in po, not public.



That's actually a bug because the previous version of the patch did not 
set the reqext correctly after creating the required extension.




Hence, as a schema can only be specified in a control file for a
non-relocatable extension, I think that the schema name propagated to
the root extensions should be the one specified in the control file of
the child if it is defined there, instead of the one specified by the user
(imagine for example an extension created in cascade that requires
adminpack, an extension that can only be deployed in pg_catalog), and
have the root node use its own schema if it has one in its control
file by default.

For example in this case:
foo1 (WITH SCHEMA hoge) ---> foo2 (schema = pg_catalog) ---> foo2_1
  |                            |
  |                            +--> foo2_2
  +--> foo3 (no schema)
With this command:
CREATE EXTENSION foo1 WITH SCHEMA hoge;
foo3 is created in schema hoge. foo2, as well as its required dependencies
foo2_1 and foo2_2, are created in pg_catalog.

Now what DEFAULT SCHEMA is trying to achieve is: "Hey, I want to define foo2_1
and foo2_2 in schema hoge". This may be worth achieving (though IMO I
think that foo1 should have direct dependencies with foo2_1 and
foo2_2), but I think that we should decide what is the default
behavior we want, and implement the additional behavior in a second
patch, separated from the patch of this thread. Personally I am more a
fan of propagating to parent extensions the schema of the child and
not of its grand-child by default, but I am not alone here :)



This is something that I, as a user of the feature, specifically don't 
want to happen: I install extensions either into a common schema (for some 
utility ones) or into separate schemas (for the rest), but I never want 
the utility extensions to go into the same schema as the other ones. This 
is especially true when installing a non-relocatable extension which 
depends on a relocatable one. Obviously, it does not mean that nobody else 
wants this though :)


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 54529b7e8bf2e5554569369ff976befbac981a25 Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Sat, 13 Jun 2015 19:49:10 +0200
Subject: [PATCH] CREATE EXTENSION CASCADE

---
 contrib/hstore_plperl/expected/hstore_plperl.out   |   6 +-
 contrib/hstore_plperl/expected/hstore_plperlu.out  |   6 +-
 contrib/hstore_plperl/sql/hstore_plperl.sql|   4 +-
 contrib/hstore_plperl/sql/hstore_plperlu.sql   |   4 +-
 .../hstore_plpython/expected/hstore_plpython.out   |   4 +-
 contrib/hstore_plpython/sql/hstore_plpython.sql|   3 +-
 contrib/ltree_plpython/expected/ltree_plpython.out |   4 +-
 contrib/ltree_plpython/sql/ltree_plpython.sql  |   3 +-
 doc/src/sgml/ref/create_extension.sgml |  22 -
 src/backend/comman

Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-07-20 Thread Beena Emerson
Simon Riggs wrote:

> The choice between formats is not
> solely predicated on whether we have multi-line support.

> I still think writing down some actual use cases would help bring the
> discussion to a conclusion. Inventing a general facility is hard without
> some clear goals about what we need to support.

We need to at least support the following:
- Grouping: Specify a set of standbys along with the minimum number of commits
required from the group.
- Group Type: Groups can be either priority or quorum groups.
- Group names: to simplify status reporting
- Nesting: At least 2 levels of nesting

Using JSON, sync rep parameter to replicate in 2 different clusters could be
written as: 

   {"remotes": 
{"quorum": 2, 
 "servers": [{"london": 
{"prioirty": 2, 
 "servers": ["lndn1", "lndn2", "lndn3"]
}}
,
  {"nyc":
{"priority": 1,
 "servers": ["ny1", "ny2"]
}}
  ]
}
}

The same parameter in the new language (as suggested above) could be written
as:
 'remotes: 2(london: 1[lndn1, lndn2, lndn3], nyc: 1[ny1, ny2])'

Also, I was thinking the name of the main group could be optional.
Internally, it can be given the name 'default group' or 'main group' for
status reporting.

The above could also be written as:
 '2(london: 2[lndn1, lndn2, lndn3], nyc: 1[ny1, ny2])'

Backward compatibility:
While validating, we may have to check if it starts with '{' and go for
JSON parsing, else proceed with the current method.

A,B,C => 1[A,B,C]. This can be added in the new parser code.
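
A sketch of that dispatch (illustrative only - SyncRepConfig and the two
parse functions are hypothetical names):

static SyncRepConfig *
parse_synchronous_standby_names(const char *value)
{
	const char *p = value;

	while (isspace((unsigned char) *p))
		p++;

	if (*p == '{')
		return parse_sync_rep_json(value);	/* JSON format */

	/* new grammar; a bare "A,B,C" is read as 1[A,B,C] */
	return parse_sync_rep_grammar(value);
}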

Thoughts?



-
Beena Emerson

--
View this message in context: 
http://postgresql.nabble.com/Support-for-N-synchronous-standby-servers-take-2-tp5849384p5858571.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-20 Thread Zhaomo Yang
>
> Just to be clear, the idea of a global temp table is that the table def
> is available to all users, but the data is private to each session?


The table def is visible to all sessions and persistent, but the data is
private to each session and temporary.
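
To make the idea concrete (Oracle-style syntax, shown for illustration
only - PostgreSQL does not accept this today):

CREATE GLOBAL TEMPORARY TABLE session_scratch (
    id    int,
    note  text
) ON COMMIT PRESERVE ROWS;
-- one persistent catalog entry, visible to every session; rows inserted
-- by a session are visible only in that session and are gone when it
-- ends (or at commit, with ON COMMIT DELETE ROWS).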

Thanks,
Zhaomo


Re: [HACKERS] [PATCH] pg_upgrade fails when postgres/template1 isn't in default tablespace

2015-07-20 Thread Michael Paquier
On Mon, Jul 20, 2015 at 4:31 PM, Marti Raudsepp  wrote:
> On Mon, Jun 22, 2015 at 9:20 PM, Robert Haas  wrote:
>> On Fri, Jun 19, 2015 at 12:10 PM, Marti Raudsepp  wrote:
>>> One of my databases failed to upgrade successfully and produced this error
>>> in the copying phase:
>>>
>>> error while copying relation "pg_catalog.pg_largeobject"
>>> ("/srv/ssd/PG_9.3_201306121/1/12023" to "/PG_9.4_201409291/1/12130"): No
>>> such file or directory
>
>> I haven't looked at the patch, but this seems like a good thing to
>> fix.  Hopefully Bruce will take a look soon.
>
> Ping? This has gone a month without a reply.
>
> Should I add this patch to a commitfest? (I was under the impression
> that bugfixes don't normally go through the commitfest process)

After a certain amount of time without anything happening, I would
recommend just adding it to a CF to have it get attention. I imagine
that is one of the reasons why there is also a "Bug Fix" category.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Use pg_rewind when target timeline was switched

2015-07-20 Thread Alexander Korotkov
Hackers,

The attached patch allows pg_rewind to work when the target timeline was
switched. Actually, this patch fixes a TODO from the pg_rewind comments.

  /*
   * Trace the history backwards, until we hit the target timeline.
   *
   * TODO: This assumes that there are no timeline switches on the target
   * cluster after the fork.
   */

This patch allows pg_rewind to handle data directory synchronization in a
much more general way. For instance, a user can return a promoted standby to
the old master.

In this patch the target timeline history is exposed as a global variable.
An index into the target timeline history is used in function interfaces
instead of specifying the TLI directly. Thus, SimpleXLogPageRead() can easily
start reading XLOG from the next timeline when the current timeline ends.
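
Roughly like this (a sketch; the entry fields follow timeline.h's
TimeLineHistoryEntry, while the variable names are illustrative):

	/* in SimpleXLogPageRead(), before reading the next page */
	if (targetHistory[targetIndex].end != InvalidXLogRecPtr &&
		targetPagePtr >= targetHistory[targetIndex].end &&
		targetIndex + 1 < targetNentries)
	{
		targetIndex++;				/* current timeline ends here... */
		tli = targetHistory[targetIndex].tli;	/* ...continue on the next */
	}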

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


pg_rewind_target_switch.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape

2015-07-20 Thread Petr Jelinek

On 2015-07-20 12:18, Petr Jelinek wrote:

On 2015-07-19 22:56, Tom Lane wrote:


* You might have expected me to move the tsmseqscan and tsmpagemode flags
into the TsmRoutine struct, but instead this API puts equivalent flags
into the SampleScanState struct.  The reason for that is that it lets
the settings be determined at runtime after inspecting the TABLESAMPLE
parameters, which I think would be useful.  For example, whether to use
the bulkread strategy should probably depend on what the sampling
percentage is.



I think this ignores one aspect of the old API. That is, we allowed the
sampling method to specify which kind of input parameters it accepts.
This is why for example the tsm_system_rows accepts integer while system
and bernoulli both accept float8. This part of the API is certainly not
needed for bernoulli and system sampling but if we ever want to do
something more sophisticated like stratified sampling which Simon
mentioned several times, specifying just percentages is not going to be
enough.



Actually I see the tsmapi has support for this, so please ignore the noise.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Improve BRIN documentation somewhat

2015-07-20 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Improve BRIN documentation somewhat
> 
> This removes some info about support procedures being used, which was
> obsoleted by commit db5f98ab4f, as well as add some more documentation
> on how to create new opclasses using the Minmax infrastructure.
> (Hopefully we can get something similar for Inclusion as well.)

Emre Hasegeli told me on IM he's going to submit a patch to add
something similar for the inclusion opclass framework.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape

2015-07-20 Thread Petr Jelinek

On 2015-07-19 22:56, Tom Lane wrote:

Since I'm not observing any movement on the key question of redesigning
the tablesample method API, and I think that's something that's absolutely
necessary to fix for 9.5, attached is an attempt to respecify the API.



Sorry, I had something similar to what you posted written as well, but I 
didn't want to submit it before I had more working code done.




* I got rid of the TableSampleDesc struct altogether in favor of giving
the execution functions access to the whole SampleScanState executor
state node.  If we add support for filtering at the join level, filtering
in indexscan nodes, etc, those would become separate sets of API functions
in my view; there is no need to pretend that this set of API functions
works for anything except the SampleScan case.  This way is more parallel
to the FDW precedent, too.  In particular it lets tablesample code get at
the executor's EState, which the old API did not, but which might be
necessary for some scenarios.


Ok.



* You might have expected me to move the tsmseqscan and tsmpagemode flags
into the TsmRoutine struct, but instead this API puts equivalent flags
into the SampleScanState struct.  The reason for that is that it lets
the settings be determined at runtime after inspecting the TABLESAMPLE
parameters, which I think would be useful.  For example, whether to use
the bulkread strategy should probably depend on what the sampling
percentage is.



I think this ignores one aspect of the old API. That is, we allowed the 
sampling method to specify which kind of input parameters it accepts. 
This is why for example the tsm_system_rows accepts integer while system 
and bernoulli both accept float8. This part of the API is certainly not 
needed for bernoulli and system sampling but if we ever want to do 
something more sophisticated like stratified sampling which Simon 
mentioned several times, specifying just percentages is not going to be 
enough.




* As written, this still allows TABLESAMPLE parameters to have null
values, but I'm pretty strongly tempted to get rid of that: remove
the paramisnull[] argument and make the core code reject null values.
I can't see any benefit in allowing null values that would justify the
extra code and risks-of-omission involved in making every tablesample
method check this for itself.



I am for not allowing NULLs.


* I specified that omitting NextSampleBlock is allowed and causes the
core code to do a standard seqscan, including syncscan support, which
is a behavior that's impossible with the current API.  If we fix
the bernoulli code to have history-independent sampling behavior,
I see no reason that syncscan shouldn't be enabled for it.



Umm, we were actually doing syncscan as well before.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Grouping Sets: Fix unrecognized node type bug

2015-07-20 Thread Jeevan Chalke
On Sat, Jul 18, 2015 at 12:27 AM, Andrew Gierth  wrote:

> > "Kyotaro" == Kyotaro HORIGUCHI 
> writes:
>
>  Kyotaro> Hello, this looks to be a kind of thinko. The attached patch
>  Kyotaro> fixes it.
>
> No, that's still wrong. Just knowing that there is a List is not enough
> to tell whether to concat it or append it.
>
> Jeevan's original patch tries to get around this by making the RowExpr
> case wrap another List around its result (which is then removed by the
> concat), but this is the wrong approach too because it breaks nested
> RowExprs (which isn't valid syntax in the spec, because the spec allows
> only column references in GROUP BY, not arbitrary expressions, but which
> we have no reason not to support).
>
> Attached is the current version of my fix (with Jeevan's regression
> tests plus one of mine).
>

Looks good to me.


>
> --
> Andrew (irc:RhodiumToad)
>
>


-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] object_classes array is broken, again

2015-07-20 Thread Alvaro Herrera
Any opinions on this idea?  I don't like it all that much, but it's
plenty effective.

Alvaro Herrera wrote:
 
> The problem is that there aren't enough callers of add_object_address:
> there are many indexes of that array that aren't ever accessed and so
> it's not obvious when the array is broken.  If we were to put
> OCLASS_CLASS at the end instead of at the beginning, that would fix the
> problem by making it immediately obvious when things get broken this
> way, because the value used in the most common case would shift around
> every time we add another value.  (Of course, we'd have to instruct
> people to not add new members after the pg_class entry.)

> diff --git a/src/backend/catalog/dependency.c 
> b/src/backend/catalog/dependency.c
> index c1212e9..0107c53 100644
> --- a/src/backend/catalog/dependency.c
> +++ b/src/backend/catalog/dependency.c
> @@ -127,7 +127,6 @@ typedef struct
>   * See also getObjectClass().
>   */
>  static const Oid object_classes[MAX_OCLASS] = {
> - RelationRelationId, /* OCLASS_CLASS */
>   ProcedureRelationId,/* OCLASS_PROC */
>   TypeRelationId, /* OCLASS_TYPE */
>   CastRelationId, /* OCLASS_CAST */
> @@ -158,7 +157,9 @@ static const Oid object_classes[MAX_OCLASS] = {
>   DefaultAclRelationId,   /* OCLASS_DEFACL */
>   ExtensionRelationId,/* OCLASS_EXTENSION */
>   EventTriggerRelationId, /* OCLASS_EVENT_TRIGGER */
> - PolicyRelationId/* OCLASS_POLICY */
> + PolicyRelationId,   /* OCLASS_POLICY */
> + TransformRelationId,/* OCLASS_TRANSFORM */
> + RelationRelationId  /* OCLASS_CLASS */
>  };
>  
>  
> diff --git a/src/include/catalog/dependency.h 
> b/src/include/catalog/dependency.h
> index 5da18c2..6f4802d 100644
> --- a/src/include/catalog/dependency.h
> +++ b/src/include/catalog/dependency.h
> @@ -112,11 +112,10 @@ typedef struct ObjectAddresses ObjectAddresses;
>  
>  /*
>   * This enum covers all system catalogs whose OIDs can appear in
> - * pg_depend.classId or pg_shdepend.classId.
> + * pg_depend.classId or pg_shdepend.classId.  See also object_classes[].
>   */
>  typedef enum ObjectClass
>  {
> - OCLASS_CLASS,   /* pg_class */
>   OCLASS_PROC,/* pg_proc */
>   OCLASS_TYPE,/* pg_type */
>   OCLASS_CAST,/* pg_cast */
> @@ -149,6 +148,11 @@ typedef enum ObjectClass
>   OCLASS_EVENT_TRIGGER,   /* pg_event_trigger */
>   OCLASS_POLICY,  /* pg_policy */
>   OCLASS_TRANSFORM,   /* pg_transform */
> + /*
> +  * Keep this previous-to-last, see
> +  * https://www.postgresql.org/message-id/
> +  */
> + OCLASS_CLASS,   /* pg_class */
>   MAX_OCLASS  /* MUST BE LAST */
>  } ObjectClass;
>  

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-20 Thread Pavel Stehule
2015-07-20 11:07 GMT+02:00 Alvaro Herrera :

> Pavel Stehule wrote:
> > 2015-07-20 5:33 GMT+02:00 Gavin Flower :
> >
> > > Would it be difficult to add the ability for one user to share the
> > > contents with a list of named other users (roles)?
> >
> > Probably it is possible, but not for temporary data - short-lived data
> > are in process memory, so they are not accessible from other sessions.
> >
> > This sharing of tables needs:
> >
> > 1. some infrastructure to hold data about sharing - who can share with
> > what
> > 2. who will clean the data? temporary data are cleaned at end of
> > transaction or end of session
> > 3. data should be saved in shared memory instead of process memory
> >
> > So it is possible, but partially different
>
> To me this gets in the "crazy ideas" list.  Please add it to the TODO
> page in the wiki, so that we're sure we never implement it.
>

yes, it is pretty crazy - Have no plan to implement it :)

Pavel


> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] Re: [COMMITTERS] pgsql: Retain comments on indexes and constraints at ALTER TABLE ... TY

2015-07-20 Thread Andres Freund
On 2015-07-17 21:40:45 +0300, Heikki Linnakangas wrote:
> Hmm, that function is pretty fragile, it will segfault on any AT_* type that
> it doesn't recognize. Thankfully you get that compiler warning, but we have
> added AT_* type codes before in minor releases.

For in-core code that is supposed to handle all cases I find it much
better to get a compiler warning than to error out in a default:
case. The latter is very likely not going to be exercised by any test
and thus not be noticed for a prolonged time.

Might make sense to test for -Werror=switch and add that to the compiler
flags by default.
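
A standalone illustration (not PostgreSQL code) of why leaving out the
default case pays off with that flag:

typedef enum { AT_AddColumn, AT_DropColumn, AT_ReAddComment } AlterTableType;

static const char *
describe(AlterTableType subtype)
{
	switch (subtype)
	{
		case AT_AddColumn:
			return "ADD COLUMN";
		case AT_DropColumn:
			return "DROP COLUMN";
		/* no default: with -Werror=switch the build now fails with
		 * "enumeration value 'AT_ReAddComment' not handled in switch"
		 * as soon as a new code is added to the enum */
	}
	return "unrecognized";
}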

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-20 Thread Thakur, Sameer
Hello,
>Does this actually handle multiple indexes? It doesn't appear so, which I'd 
>think is a significant problem... :/
Please find v2 attached which does this.
>I'm also not seeing how this will deal with exhausting maintenance_work_mem. 
>ISTM that when that happens you'd definitely want a better idea of what's 
>going on...
Will work on this aspect in v3.
Thank you,
Sameer

__
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.


IndexScanProgress_v2.patch
Description: IndexScanProgress_v2.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Retain comments on indexes and constraints at ALTER TABLE ... TY

2015-07-20 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Heikki Linnakangas wrote:
> > On 07/18/2015 04:15 PM, Alvaro Herrera wrote:
> > >Heikki Linnakangas wrote:
> > >>If it's there just to so you can run the regression tests that come
> > >>with it, it might make sense to just add a "default" case to that
> > >>switch to handle any unrecognized commands, and perhaps even remove
> > >>the cases for the currently untested subcommands as it's just dead
> > >>code.
> > >
> > >Well, I would prefer to have an output that says "unrecognized" and then
> > >add more test cases to the SQL files so that there's not so much dead
> > >code.  I prefer that to removing the C support code, because then as
> > >we add extra tests we don't need to modify the C source.
> > 
> > Ok. I added a case for AT_ReAddComment, and also a default-case that says
> > "unrecognized".
> 
> Oh, sorry, I forgot to tell you that I was working on it.

I pushed the remaining part of my patch.  Thanks for fixing the other
bits.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-20 Thread Andres Freund
On 2015-07-20 15:33:32 +1200, Gavin Flower wrote:
> Would it be difficult to add the ability for one user to share the contents
> with a list of named other users (roles)?

No need. That feature is called unlogged tables and grants.

Doing this for temporary tables would be horrible. They live in process
local memory and not shared memory - and that is because it provides
higher isolation, not in spite of it.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-20 Thread Alvaro Herrera
Pavel Stehule wrote:
> 2015-07-20 5:33 GMT+02:00 Gavin Flower :
> 
> > Would it be difficult to add the ability for one user to share the
> > contents with a list of named other users (roles)?
> 
> Probably it is possible, but not for temporary data - short-lived data are
> in process memory, so they are not accessible from other sessions.
> 
> This sharing of tables needs:
> 
> 1. some infrastructure to hold data about sharing - who can share with what
> 2. who will clean the data? temporary data are cleaned at end of transaction
> or end of session
> 3. data should be saved in shared memory instead of process memory
> 
> So it is possible, but partially different

To me this gets in the "crazy ideas" list.  Please add it to the TODO
page in the wiki, so that we're sure we never implement it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Retain comments on indexes and constraints at ALTER TABLE ... TY

2015-07-20 Thread Alvaro Herrera
Heikki Linnakangas wrote:
> On 07/18/2015 04:15 PM, Alvaro Herrera wrote:
> >Heikki Linnakangas wrote:
> >>If it's there just to so you can run the regression tests that come
> >>with it, it might make sense to just add a "default" case to that
> >>switch to handle any unrecognized commands, and perhaps even remove
> >>the cases for the currently untested subcommands as it's just dead
> >>code.
> >
> >Well, I would prefer to have an output that says "unrecognized" and then
> >add more test cases to the SQL files so that there's not so much dead
> >code.  I prefer that to removing the C support code, because then as
> >we add extra tests we don't need to modify the C source.
> 
> Ok. I added a case for AT_ReAddComment, and also a default-case that says
> "unrecognized".

Oh, sorry, I forgot to tell you that I was working on it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] All-zero page in GIN index causes assertion failure

2015-07-20 Thread Heikki Linnakangas
This is a continuation of the discussion at 
http://www.postgresql.org/message-id/CAMkU=1zUc=h0oczntajaqqw7gxxvxcwsyq8dd2t7ohgsgve...@mail.gmail.com, 
I'm starting a new thread as this is a separate issue from the original 
LWLock bug.



On Thu, Jul 16, 2015 at 12:03 AM, Jeff Janes  wrote:


On Wed, Jul 15, 2015 at 8:44 AM, Heikki Linnakangas 
wrote:

I don't see how this is related to the LWLock issue, but I didn't see it
without your patch.  Perhaps the system just didn't survive long enough to
uncover it without the patch (although it shows up pretty quickly).  It
could just be an overzealous Assert, since the build with casserts off
didn't show problems.



bt and bt full are shown below.

Cheers,

Jeff

#0  0x003dcb632625 in raise () from /lib64/libc.so.6
#1  0x003dcb633e05 in abort () from /lib64/libc.so.6
#2  0x00930b7a in ExceptionalCondition (
conditionName=0x9a1440 "!(((PageHeader) (page))->pd_special >=
(__builtin_offsetof (PageHeaderData, pd_linp)))", errorType=0x9a12bc
"FailedAssertion",
fileName=0x9a12b0 "ginvacuum.c", lineNumber=713) at assert.c:54
#3  0x004947cf in ginvacuumcleanup (fcinfo=0x7fffee073a90) at
ginvacuum.c:713



It now looks like this *is* unrelated to the LWLock issue.  The assert that
it is tripping over was added just recently (302ac7f27197855afa8c) and so I
had not been testing under its presence until now.  It looks like it is
finding all-zero pages (index extended but then a crash before initializing
the page?) and it doesn't like them.

(gdb) f 3
(gdb) p *(char[8192]*)(page)
$11 = '\000' 

Presumably before this assert, such pages would just be permanently
orphaned.


Yeah, so it seems. It's normal to have all-zero pages in the index, if 
you crash immediately after the relation has been extended, but before 
the new page has been WAL-logged. What is your test case like; did you 
do crash-testing?


ISTM ginvacuumcleanup should check for PageIsNew, and put the page in 
the FSM. That's what btvacuumpage() and gistvacuumcleanup() do. 
spgvacuumpage() seems to also check for PageIsNew(), but it seems broken 
in a different way: it initializes the page and marks the buffer dirty, 
but the change is not WAL-logged. That is a problem at least if checksums 
are enabled: if you crash you might have a torn page on disk, with an 
invalid checksum.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-07-20 Thread Haribabu Kommi
On Mon, Jul 20, 2015 at 3:31 PM, Amit Kapila  wrote:
> On Fri, Jul 17, 2015 at 1:22 PM, Haribabu Kommi 
> wrote:
>>
>> On Thu, Jul 16, 2015 at 1:10 PM, Amit Kapila 
>> wrote:
>> > Thanks, I will fix this in next version of patch.
>> >
>>
>> I am posting in this thread as I am not sure, whether it needs a
>> separate thread or not?
>>
I went through the code and found that the newly added Funnel node is
tightly coupled with partial seq scan; in order to add many more parallel
plans along with parallel seq scan, we need to remove the integration of
this node with partial seq scan.
>>
>
> This assumption is wrong, Funnel node can execute any node beneath
> it (Refer ExecFunnel->funnel_getnext->ExecProcNode, similarly you
> can see exec_parallel_stmt).

Yes, the Funnel node can execute any node beneath it. But during the planning
phase, the funnel path is added on top of the partial scan path. I just want
the same to be enhanced to support other parallel nodes.

> Yes, currently nodes supported under
> Funnel nodes are limited like partialseqscan, result (due to reasons
> mentioned upthread like readfuncs.c doesn't have support to read Plan
> nodes which is required for worker backend to read the PlannedStmt,
> ofcourse we can add them, but as we are supportting parallelism for
> limited nodes, so I have not enhanced the readfuncs.c) but in general
> the basic infrastructure is designed such a way that it can support
> other nodes beneath it.
>
>> To achieve the same, I have the following ideas.
>>
>>
>> Execution:
>> The funnel execution varies based on the below plan node.
>> 1) partial scan - Funnel does the local scan also and returns the tuples
>> 2) partial agg - Funnel does the merging of aggregate results and
>> returns the final result.
>>
>
> Basically Funnel will execute any node beneath it, the Funnel node itself
> is not responsible for doing local scan or any form of consolidation of
> results, as of now, it has these 3 basic properties
> – Has one child, runs multiple copies in parallel.
> – Combines the results into a single tuple stream.
> – Can run the child itself if no workers available.

+ if (!funnelstate->local_scan_done)
+ {
+ outerPlan = outerPlanState(funnelstate);
+
+ outerTupleSlot = ExecProcNode(outerPlan);

In the above code from the funnel_getnext function, the Funnel node
directly calls the node below it to do the scan on the backend side as
well. This code should look at the type of the node below, and only
based on that decide whether to do the backend-local scan.

I feel that always executing the outer plan may not be correct for other
parallel nodes.
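
Something along these lines is what I have in mind (a sketch;
PartialSeqScanState is the node from this patch series, and the exact
check is illustrative):

+ if (!funnelstate->local_scan_done)
+ {
+ outerPlan = outerPlanState(funnelstate);
+
+ /* run the child locally only when it is a node type the master
+  * backend can execute itself, e.g. the partial scan */
+ if (IsA(outerPlan, PartialSeqScanState))
+ outerTupleSlot = ExecProcNode(outerPlan);
+ }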

>> Any other better ideas to achieve the same?
>>
>
> Refer slides 16-19 in Parallel Sequential Scan presentation in PGCon
> https://www.pgcon.org/2015/schedule/events/785.en.html

Thanks for the information.

> I don't have very clear idea what is the best way to transform the nodes
> in optimizer, but I think we can figure that out later unless majority
> people
> see that as blocking factor.

I am also not seeing it as a blocking factor for parallel scan.
I wrote the above mail to get some feedback/suggestions from hackers on
how to proceed in adding other parallelism nodes along with parallel scan.

Regards,
Hari Babu
Fujitsu Australia


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-07-20 Thread Simon Riggs
On 20 July 2015 at 08:18, Beena Emerson  wrote:

>  Simon Riggs wrote:
>
> >synchronous_standby_name= is already 25 characters, so that leaves 115
> characters - are they always single byte chars?
>
> I am sorry, I did not get why there is a 140 byte limit. Can you please
> explain?
>

Hmm, sorry, I thought Robert had said there was a 140 byte limit. I misread.

I don't think that affects my point. The choice between formats is not
solely predicated on whether we have multi-line support.

I still think writing down some actual use cases would help bring the
discussion to a conclusion. Inventing a general facility is hard without
some clear goals about what we need to support.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


[HACKERS] Re: [COMMITTERS] pgsql: Retain comments on indexes and constraints at ALTER TABLE ... TY

2015-07-20 Thread Heikki Linnakangas

On 07/18/2015 04:15 PM, Alvaro Herrera wrote:

Heikki Linnakangas wrote:

If it's there just to so you can run the regression tests that come
with it, it might make sense to just add a "default" case to that
switch to handle any unrecognized commands, and perhaps even remove
the cases for the currently untested subcommands as it's just dead
code.


Well, I would prefer to have an output that says "unrecognized" and then
add more test cases to the SQL files so that there's not so much dead
code.  I prefer that to removing the C support code, because then as
we add extra tests we don't need to modify the C source.


Ok. I added a case for AT_ReAddComment, and also a default-case that 
says "unrecognized".

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_upgrade fails when postgres/template1 isn't in default tablespace

2015-07-20 Thread Marti Raudsepp
On Mon, Jun 22, 2015 at 9:20 PM, Robert Haas  wrote:
> On Fri, Jun 19, 2015 at 12:10 PM, Marti Raudsepp  wrote:
>> One of my databases failed to upgrade successfully and produced this error
>> in the copying phase:
>>
>> error while copying relation "pg_catalog.pg_largeobject"
>> ("/srv/ssd/PG_9.3_201306121/1/12023" to "/PG_9.4_201409291/1/12130"): No
>> such file or directory

> I haven't looked at the patch, but this seems like a good thing to
> fix.  Hopefully Bruce will take a look soon.

Ping? This has gone a month without a reply.

Should I add this patch to a commitfest? (I was under the impression
that bugfixes don't normally go through the commitfest process)

Regards,
Marti Raudsepp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-07-20 Thread Beena Emerson
 Simon Riggs wrote: 

>synchronous_standby_name= is already 25 characters, so that leaves 115
characters - are they always single byte chars?

I am sorry, I did not get why there is a 140 byte limit. Can you please
explain?




-
Beena Emerson

--
View this message in context: 
http://postgresql.nabble.com/Support-for-N-synchronous-standby-servers-take-2-tp5849384p5858502.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-07-20 Thread Ashutosh Bapat
Hi Hanada-san,
I have reviewed usermapping patch and here are some comments.

The patch applies cleanly on Head and compiles without problem. The make
check in regression folder does not show any failure.

In find_user_mapping(), if the first cache search returns a valid tuple, it
is checked twice for validity, unnecessarily. Instead, if the first search
returns a valid tuple, it should be returned immediately. I see that this
code was copied from GetUserMapping(), which has the same problem.
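
Concretely, something like this ordering would avoid the double check (a
sketch; the surrounding code in foreign.c differs in detail):

	tuple = SearchSysCache2(USERMAPPINGUSERSERVER,
							ObjectIdGetDatum(userid),
							ObjectIdGetDatum(serverid));
	if (HeapTupleIsValid(tuple))
		return tuple;			/* found: return immediately */

	/* not found for this user, fall back to the PUBLIC mapping */
	tuple = SearchSysCache2(USERMAPPINGUSERSERVER,
							ObjectIdGetDatum(InvalidOid),
							ObjectIdGetDatum(serverid));
	if (!HeapTupleIsValid(tuple))
		ereport(ERROR,
				(errcode(ERRCODE_UNDEFINED_OBJECT),
				 errmsg("user mapping not found for \"%s\"",
						MappingUserName(userid))));
	return tuple;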

In build_join_rel(), we check whether the user mappings of the two joining
relations are the same. If the user mappings are the same, doesn't that imply
that the foreign server and local user are the same too?

Rest of the patch looks fine.

On Thu, May 28, 2015 at 10:50 AM, Shigeru Hanada 
wrote:

> 2015-05-22 18:37 GMT+09:00 Shigeru Hanada :
> > 2015-05-21 23:11 GMT+09:00 Robert Haas :
> >> On Sat, May 16, 2015 at 9:04 AM, Shigeru Hanada
> >>  wrote:
> >>> d) All relations must have the same effective user id
> >>> This check is done with userid stored in PgFdwRelationInfo, which is
> >>> valid only when underlying relations have the same effective user id.
> >>> Here "effective user id" means id of the user executing the query, or
> >>> the owner of the view when the foreign table is accessed via view.
> >>> Tom suggested that it is necessary to check that user mapping matches
> >>> for security reason, and now I think it's better than checking
> >>> effective user as current postgres_fdw does.
> >>
> >> So, should this be a separate patch?
>
> I wrote patches for this issue.  Let me describe their design.
>
> To require the matching of user mapping between two relations (base or
> join) which involve foreign tables, it would require these things:
>
> a) Add new field umid to RelOptInfo to hold OID of user mapping used
> for the relation and children
> b) Propagate umid up to join relation only when both outer and inner
> have the same valid values
> c) Check matching of umid between two relations to be joined before
> calling GetForeignJoinPaths
>
> For a), adding an OID field would not be a serious burden.  Obtaining
> the OID of the user mapping can be accomplished by calling GetUserMapping
> in get_relation_info, but that allocates an unnecessary UserMapping
> object, so I added GetUserMappingId which just returns the OID.
>
> One concern about getting user mapping is checkAsUser.  Currently FDW
> authors are required to consider which user is valid as argument of
> GetUserMapping, because it is different from the result of GetUserId
> when the target relation is accessed via a view which is owned by
> another user.  This requirement would be easily ignored by FDW authors
> and the ignorance causes terrible security issues.  So IMO it should
> be hidden in the core code, and FDW authors should use user mappings
> determined by the core.  This would break FDW I/F, so we can't change
> it right now, but making GetUserMapping obsolete and mention it in the
> documents would guide FDW authors to the right direction.
>
> For b), such check should be done in build_join_rel, similarly to serverid.
>
> For c), we can reuse the check about RelOptInfo#fdwroutine in
> add_paths_to_joinrel, because all of serverid, umid and fdwroutine are
> valid only when both the servers and the user mappings match between
> outer and inner relations.
>
> Attached are the patches which implement the idea above except
> checkAsUser issue.
> usermapping_matching.patch: check matching of user mapping OIDs
> add_GetUserMappingById.patch: add helper function which is handy for
> FDWs to obtain UserMapping
> foreign_join_v16.patch: postgres_fdw which assumes user mapping
> matching is done in core
>
> Another idea about a) is to have an entire UserMapping object for each
> RelOptInfo, but it would require UserMapping to have an extra field for
> its Oid, say umid, to compare them by OID.  But IMO creating a
> UserMapping for every RelOptInfo seems a waste of space and time.
>
> Thoughts?
> --
> Shigeru HANADA
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company