Re: [HACKERS] CLUSTER command progress monitor

2017-08-31 Thread Thomas Munro
On Thu, Aug 31, 2017 at 2:12 PM, Tatsuro Yamada wrote:
> Any comments or suggestion are welcome.

Although this patch updates src/test/regress/expected/rules.out, I
think perhaps you included the wrong version?  That regression test
fails for me.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] SERIALIZABLE with parallel query

2017-08-31 Thread Thomas Munro
On Wed, Jun 28, 2017 at 11:21 AM, Thomas Munro wrote:
> [ssi-parallel-v5.patch]

Rebased.

-- 
Thomas Munro
http://www.enterprisedb.com


ssi-parallel-v6.patch
Description: Binary data



Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-08-31 Thread Michael Paquier
On Fri, Sep 1, 2017 at 12:25 AM, Bossart, Nathan wrote:
> On 8/31/17, 2:24 AM, "Masahiko Sawada" wrote:
>> I reviewed these patches and found an issue.
>
> Thanks for reviewing.
>
>> The autovacuum worker does not seem to work correctly. I got an error message:
>>
>> ERROR:  unrecognized node type: 0
>> CONTEXT:  automatic analyze of table "postgres.public.hoge"
>>
>> I think we should set T_RangeVar to rangevar.type in
>> autovacuum_do_vac_analyze function.
>
> Yes, it looks like the NodeTag is not getting set on the RangeVar.
> I went ahead and switched this to makeRangeVar(...) instead of
> keeping it manually allocated on the stack.  Autovacuum seems to be
> working as usual now.

Hm. Here is the diff between v11 and v12:
 static void
 autovacuum_do_vac_analyze(autovac_table *tab, BufferAccessStrategy bstrategy)
 {
-	RangeVar	rangevar;
-	VacuumRelation *rel;
-
-	/* Set up command parameters --- use local variables instead of palloc */
-	MemSet(&rangevar, 0, sizeof(rangevar));
-
-	rangevar.schemaname = tab->at_nspname;
-	rangevar.relname = tab->at_relname;
-	rangevar.location = -1;
+	RangeVar   *rangevar;
+	VacuumRelation *rel;
 
 	/* Let pgstat know what we're doing */
 	autovac_report_activity(tab);
 
-	rel = makeVacuumRelation(&rangevar, NIL, tab->at_relid);
+	rangevar = makeRangeVar(tab->at_nspname, tab->at_relname, -1);
+	rel = makeVacuumRelation(rangevar, NIL, tab->at_relid);
But there is this comment in vacuum.c:
 * It is the caller's responsibility that all parameters are allocated in a
 * memory context that will not disappear at transaction commit.
And I don't think we want to break that promise, as newNode() uses
palloc0fast(), which allocates data in the current memory context (see
4873c96f). I think that you had better just use NodeSetTag here and be
done with it. Also, it seems to me that this could be fixed as a
separate patch. It is definitely an incorrect pattern...
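
To illustrate (this is just a sketch of the suggestion, not the actual
patch): keep the v11 stack allocation and simply stamp the node tag,
e.g.:

    RangeVar    rangevar;

    /* Set up command parameters --- use local variables instead of palloc */
    MemSet(&rangevar, 0, sizeof(rangevar));
    NodeSetTag(&rangevar, T_RangeVar);  /* avoids "unrecognized node type: 0" */
    rangevar.schemaname = tab->at_nspname;
    rangevar.relname = tab->at_relname;
    rangevar.location = -1;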

-   $$ = (Node *)n;
+   $$ = (Node *) n;
Spurious noise. Also, the coding pattern in gram.y is to not add a space
here (make new code look like its surroundings, as the documentation says).
-- 
Michael




Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-08-31 Thread Thomas Munro
On Tue, Aug 1, 2017 at 12:53 AM, Peter Moser wrote:
> On 06.04.2017 01:24, Andres Freund wrote:
>>
>> Unfortunately I don't think this patch has received sufficient design
>> and implementation to consider merging it into v10.  As code freeze is
>> in two days, I think we'll have to move this to the next commitfest.
>
>
> We rebased our patch on top of commit
> 393d47ed0f5b764341c7733ef60e8442d3e9bdc2
> from "Mon Jul 31 11:24:51 2017 +0900".

Hi Peter,

This patch still applies, but no longer compiles:

nodeTemporalAdjustment.c: In function ‘ExecTemporalAdjustment’:
nodeTemporalAdjustment.c:286:21: error: incompatible types when
assigning to type ‘Form_pg_attribute’ from type
‘FormData_pg_attribute’
   node->datumFormat = curr->tts_tupleDescriptor->attrs[tc->attNumP1 - 1];
 ^

After commits 2cd70845 and c6293249 you need to change expressions of
that format to, for example:

   node->datumFormat = TupleDescAttr(curr->tts_tupleDescriptor,
                                     tc->attNumP1 - 1);

Thanks!

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Surjective functional indexes

2017-08-31 Thread Thomas Munro
On Fri, Jun 9, 2017 at 8:08 PM, Konstantin Knizhnik wrote:
> Attached please find rebased version of the patch.
> Now "projection" attribute is used instead of surjective/injective.

Hi Konstantin,

This still applies but it doesn't compile after commits 2cd70845 and
c6293249.  You need to change this:

  Form_pg_attribute att = RelationGetDescr(indexDesc)->attrs[i];

... to this:

  Form_pg_attribute att = TupleDescAttr(RelationGetDescr(indexDesc), i);

Thanks!

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] adding the commit to a patch's thread

2017-08-31 Thread Michael Paquier
On Fri, Sep 1, 2017 at 1:11 PM, Erik Rijkers wrote:
> Would it be possible to change the commitfest a bit and make it possible to
> add the commit (or commit-message, or hash) to the thread in the
> commitfest-app.  I would think it would be best to make it so that when the
> thread gets set to state 'committed', the actual commit/hash is added
> somewhere at the same time.

There could be multiple commits. One thing that can be done now is to
add an annotation with a message saying that a portion has been
committed, mentioning the commit hash involved, say the first
7-8 characters.
-- 
Michael




[HACKERS] adding the commit to a patch's thread

2017-08-31 Thread Erik Rijkers
At the moment it's not easy to find the commit that terminates a
commitfest thread about a patch.  One has to manually compare dates and
guess what belongs to what.  The commit message nowadays often has a
link to the thread ("Discussion"), but the reverse direction is often
not so easily found.


For example: looking at

  https://commitfest.postgresql.org/14/1020/

One cannot directly find the actual commit that finished it.

Would it be possible to change the commitfest a bit and make it possible 
to add the commit (or commit-message, or hash) to the thread in the 
commitfest-app.  I would think it would be best to make it so that when 
the thread gets set to state 'committed', the actual commit/hash is 
added somewhere at the same time.



thanks,

Erik Rijkers





Re: [HACKERS] Creating backup history files for backups taken from standbys

2017-08-31 Thread Michael Paquier
On Thu, Aug 10, 2017 at 4:55 PM, Masahiko Sawada wrote:
> On Thu, Aug 10, 2017 at 4:50 PM, Michael Paquier wrote:
>> On Thu, Aug 10, 2017 at 9:45 AM, Masahiko Sawada wrote:
>>> Thank you for the patch. Regarding creating the backup history file
>>> on standbys, is there any difference from the patch posted on the
>>> earlier thread?
>>
>> That's a rebased version on top of what has been applied, except that
>> I have fixed an incorrect comment and shuffled the code building the
>> WAL segment name for the stop position.
>
> Understood, I'll review this patch.

Here is an updated patch with refreshed documentation, as a result of
449338c, which was discussed in the thread
https://www.postgresql.org/message-id/d4d951b9-89c0-6bc1-b6ff-d0b2dd5a8...@pgmasters.net.
I am just outlining the fact that a backup history file gets created
on standbys and that it is archived.
-- 
Michael


standby-backup-history-v2.patch
Description: Binary data



Re: [HACKERS] pgbench tap tests & minor fixes.

2017-08-31 Thread Michael Paquier
On Thu, Aug 31, 2017 at 9:29 PM, Nikolay Shaplov wrote:
> I am about to set "Ready for commit" status on this patch. So here is my
> summary for the committer, so one does not need to carefully read the whole
> thread.
>
> This patch consists of three parts. Maybe they should be committed
> separately; then Fabien will split them, I think.

I am not sure why, but this email has broken the whole thread on my
gmail client... Did you change the subject line indirectly? The thread
is shaped correctly on postgresql.org, though.

Thanks!
-- 
Michael




[HACKERS] Upcoming commit fest will begin soon

2017-08-31 Thread Michael Paquier
Hi all,

At the moment of this email, it is 15:25 AOE, so you still have close
to 8 hours to register patches for the upcoming commit fest:
https://commitfest.postgresql.org/14/

This commit fest is large, as expected for the first commit fest of a
new development cycle, with close to 210 patches waiting for review,
input from their author, or a committer. It is also the largest commit
fest in PG history.

Please do not forget that for any patch submitted, its author should
review one patch of similar difficulty. This is hard to quantify, but
do not hesitate to review patches in areas of the code that you have
little or no knowledge of. I myself have 10 patches still pending in
the CF, so I am planning to take my share of review and comments.

Another question is: who would like to become the CF manager this
time? This commit fest is very large, so it is going to be difficult
for one person to do the whole thing, hence at least 2 people would be
welcome. Please note that I am not directly volunteering for this one,
but per my number of patches I need to look at many things, so I may
be considered a sort of co-manager ;)
New heads for this exercise would be nice though.

Thanks,
-- 
Michael




Re: bgw_type (was Re: [HACKERS] Why does logical replication launcher set application_name?)

2017-08-31 Thread Michael Paquier
On Fri, Sep 1, 2017 at 4:49 AM, Peter Eisentraut wrote:
> On 5/30/17 23:10, Peter Eisentraut wrote:
>> Here is a proposed solution that splits bgw_name into bgw_type and
>> bgw_name_extra.  bgw_type shows up in pg_stat_activity.backend_type.
>> Uses of application_name are removed, because they are no longer
>> necessary to identify the process type.
>
> Updated patch incorporating the feedback.  I have kept bgw_name as it
> was and just added bgw_type completely independently.

- errmsg("terminating background worker \"%s\" due to
administrator command",
-MyBgworkerEntry->bgw_name)));
+ errmsg("terminating %s due to administrator command",
+MyBgworkerEntry->bgw_type)));
"terminating background worker %s of type %s due to administrator
command", bgw_name, bgw_type?

> One open question is how to treat a missing (empty) bgw_type.  I
> currently fill in bgw_name as a fallback.  We could also treat it as an
> error or a warning as a transition measure.

Hm. Why not report an empty type string as NULL at the SQL level and
just leave it empty then? I tend to prefer interfaces that report
exactly what is registered at the memory level, because that's easier
to explain to users and in the documentation, as well as easier to
interpret and easier for module developers.
-- 
Michael




Re: [HACKERS] Hash Functions

2017-08-31 Thread Tom Lane
Robert Haas writes:
> On Thu, Aug 31, 2017 at 10:55 PM, Tom Lane wrote:
>> ALTER OPERATOR FAMILY ADD FUNCTION ... ?
>> 
>> That would result in the functions being considered "loose" in the
>> family rather than bound into an operator class.  I think that's
>> actually the right thing, because they shouldn't be considered
>> to be required.

> But wouldn't that result in a different effect than the core data type
> changes I just did?

Possibly --- I have not read that patch.  But considering that all core
functions are pinned anyway, it doesn't seem like it much matters whether
we consider them to be loosely or tightly bound to opclasses.  That
should only matter if one tries to drop the function.

regards, tom lane




Re: [HACKERS] Hash Functions

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 10:55 PM, Tom Lane wrote:
> Robert Haas writes:
>> I think this takes care of adding not only the infrastructure but
>> support for all the core data types, but I'm not quite sure how to
>> handle upgrading types in contrib.  It looks like citext, hstore, and
>> several data types provided by isn have hash opclasses, and I think
>> that there's no syntax for adding a support function to an existing
>> opclass.  We could add that, but I'm not sure how safe it would be.
>
> ALTER OPERATOR FAMILY ADD FUNCTION ... ?
>
> That would result in the functions being considered "loose" in the
> family rather than bound into an operator class.  I think that's
> actually the right thing, because they shouldn't be considered
> to be required.

But wouldn't that result in a different effect than the core data type
changes I just did?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Hash Functions

2017-08-31 Thread Tom Lane
Robert Haas writes:
> I think this takes care of adding not only the infrastructure but
> support for all the core data types, but I'm not quite sure how to
> handle upgrading types in contrib.  It looks like citext, hstore, and
> several data types provided by isn have hash opclasses, and I think
> that there's no syntax for adding a support function to an existing
> opclass.  We could add that, but I'm not sure how safe it would be.

ALTER OPERATOR FAMILY ADD FUNCTION ... ?

That would result in the functions being considered "loose" in the
family rather than bound into an operator class.  I think that's
actually the right thing, because they shouldn't be considered
to be required.

regards, tom lane




Re: [HACKERS] Hash Functions

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 8:40 AM, amul sul wrote:
> Fixed in the attached version.

I fixed these up a bit and committed them.  Thanks.

I think this takes care of adding not only the infrastructure but
support for all the core data types, but I'm not quite sure how to
handle upgrading types in contrib.  It looks like citext, hstore, and
several data types provided by isn have hash opclasses, and I think
that there's no syntax for adding a support function to an existing
opclass.  We could add that, but I'm not sure how safe it would be.

TBH, I really don't care much about fixing isn, but it seems like
fixing citext and hstore would be worthwhile.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Authentication mechanisms categorization

2017-08-31 Thread Bruce Momjian
On Thu, Jul 20, 2017 at 01:00:50AM +0300, Álvaro Hernández Tortosa wrote:
> I'm mostly convinced by the power of all the parameters that already
> exist, given that you added both saslname and saslchannelbinding to the
> already existing sslmode. That's great, and allows for fine-grained choosing of
> the auth method. So it would be great if (non-libpq) driver implementations
> would expose the same parameter names to the users. I will study this for
> JDBC.

Coming in late here, but the way TLS prevents authentication downgrade
attacks is for the sender to send a list of supported authentication
methods, along with a hash of that list combined with a random number
and a secret shared with the other end.  If the list doesn't match the
hash, the list is invalid.

The secret prevents attackers from faking connections.  I think the
problem is that we don't have a consistent secret shared between the
client and the server.  We have md5 and SCRAM, but that doesn't help
because the secret is tied to the authentication methods.
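
To illustrate the shape of that check (a minimal sketch only -- every
name here is hypothetical, and a real implementation would use a
constant-time comparison such as CRYPTO_memcmp()):

    #include <stdbool.h>
    #include <string.h>
    #include <openssl/evp.h>
    #include <openssl/hmac.h>

    /* Verify that the advertised method list was not tampered with. */
    static bool
    auth_list_is_authentic(const unsigned char *secret, int secret_len,
                           const char *method_list, /* e.g. "scram-sha-256,md5" */
                           const unsigned char *nonce, size_t nonce_len,
                           const unsigned char *received_mac,
                           unsigned int mac_len)
    {
        unsigned char mac[EVP_MAX_MD_SIZE];
        unsigned int  len = 0;
        HMAC_CTX     *ctx = HMAC_CTX_new();

        /* The MAC covers the method list plus the random nonce. */
        HMAC_Init_ex(ctx, secret, secret_len, EVP_sha256(), NULL);
        HMAC_Update(ctx, (const unsigned char *) method_list,
                    strlen(method_list));
        HMAC_Update(ctx, nonce, nonce_len);
        HMAC_Final(ctx, mac, &len);
        HMAC_CTX_free(ctx);

        /* A downgraded or edited list will not match the hash. */
        return len == mac_len && memcmp(mac, received_mac, len) == 0;
    }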

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] Re: [PATCH] Fix drop replication slot blocking instead of returning error

2017-08-31 Thread Noah Misch
On Tue, Aug 29, 2017 at 12:04:33PM +0200, Alvaro Herrera wrote:
> Noah Misch wrote:
> > On Thu, Aug 24, 2017 at 03:38:20PM +0200, Simone Gotti wrote:
> > > I noticed that in postgres 10beta3, calling pg_drop_replication_slot on an
> > > active slot will block until it's released instead of returning an error
> > > like done in pg 9.6. Since this is a change in the previous behavior and
> > > the docs weren't changed, I made a patch to restore the previous behavior.
> 
> > > after git commit 9915de6c1cb calls to pg_drop_replication_slot or the
> > > replication protocol DROP_REPLICATION_SLOT command will block when a
> > > slot is in use instead of returning an error as before (as the doc
> > > states).
> > > 
> > > This patch will set nowait to true to restore the previous
> > > behavior.
> 
> > The above-described topic is currently a PostgreSQL 10 open item.
> 
> Looking at it now -- next report tomorrow.

This PostgreSQL 10 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] Update low-level backup documentation to match actual behavior

2017-08-31 Thread Michael Paquier
On Fri, Sep 1, 2017 at 5:11 AM, David Steele wrote:
> On 8/31/17 4:04 PM, Robert Haas wrote:
>> On Wed, Aug 30, 2017 at 8:37 PM, Michael Paquier wrote:
>>> Thanks for the new version. This looks fine to me.
>>
>> Committed to REL9_6_STABLE with minor wordsmithing.
>
> The edits look good to me. Thanks, Robert!

Thanks David for the patch, and Robert for the commit! The final
result is nicely shaped.
-- 
Michael




[HACKERS] Visual Studio 2017 Build Support

2017-08-31 Thread Tanay Varma

Hello,

This is with respect to the original thread on "visual studio 2017 build 
support" created by Haribabu Kommi (kommi.harib...@gmail.com).

https://www.postgresql.org/message-id/CAJrrPGcZpraBCe6fJ963kVzKdM7AWPTYmXJ=8neap87wed9...@mail.gmail.com

Firstly, I would like to thank Haribabu Kommi for authoring the patch.

I am posting a small update to the final patch submitted by Haribabu Kommi to 
also support the recent v15.3 release of Visual Studio 2017, which upgrades the 
VC tools to version 14.11.

It would be great if this patch could be accepted so that Postgres could be 
built using the latest VS tools.

I have attached a copy of the output of the regression tests to confirm that 
this patch works. (check.txt)

Thanks,

Tanay Varma



0001-VS-2017-build-support-to-PostgreSQL-updated.patch
Description: 0001-VS-2017-build-support-to-PostgreSQL-updated.patch
Setting up temp install

Installing version 11 for release in D:/PostgresContribWork/submit/postgres/tmp_install
Copying build output files...
Copying config files..
Copying Import libraries...
Copying contrib data files...
Copying Public headers..
Copying Libpq headers..
Copying Libpq internal headers..
Copying Internal headers...
Copying Server headers
Copying Grammar header...
Copying PL/pgSQL header.
70 File(s) copied
1 File(s) copied
87 File(s) copied
35 File(s) copied
19 File(s) copied
1 File(s) copied
54 File(s) copied
6 File(s) copied
2 File(s) copied
10 File(s) copied
13 File(s) copied
1 File(s) copied
19 File(s) copied
25 File(s) copied
22 File(s) copied
39 File(s) copied
2 File(s) copied
10 File(s) copied
5 File(s) copied
20 File(s) copied
7 File(s) copied
34 File(s) copied
2 File(s) copied
51 File(s) copied
6 File(s) copied
7 File(s) copied
87 File(s) copied
Copying ECPG headers...
Copying ECPG informix headers...
Copying timezone names..
Copying timezone sets...
Copying BKI files...
Copying SQL files..
Copying Information schema data.
Generating conversion proc script...
Generating timezone files...
Generating tsearch script..
Copying Stopword files..
Copying Dictionaries sample files.
Copying PL Extension files...
Installation complete.
== creating temporary instance==
== initializing database system   ==
== starting postmaster==
running on port 60848 with PID 356
== creating database "regression" ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test tablespace   ... ok
parallel group (20 tests):  text txid float4 int2 boolean oid int4 char money name varchar regproc float8 int8 uuid pg_lsn bit enum numeric rangetypes
 boolean  ... ok
 char ... ok
 name ... ok
 varchar  ... ok
 text ... ok
 int2 ... ok
 int4 ... ok
 int8 ... ok
 oid  ... ok
 float4   ... ok
 float8   ... ok
 bit  ... ok
 numeric  ... ok
 txid ... ok
 uuid ... ok
 enum ... ok
 money... ok
 rangetypes   ... ok
 pg_lsn   ... ok
 regproc  ... ok
test strings  ... ok
test numerology   ... ok
parallel group (20 tests):  lseg point polygon tstypes line reltime path macaddr tinterval circle time abstime timetz macaddr8 date interval inet box timestamp timestamptz
 point... ok
 lseg ... ok
 line ... ok
 box  ... ok
 path ... ok
 polygon  ... ok
 circle   ... ok
 date ... ok
 time ... ok
 timetz   ... ok
 timestamp... ok
 timestamptz  ... ok
 interval ... ok
 abstime  ... ok
 reltime  ... ok
 tinterval... ok
 inet ... ok
 macaddr  ... ok
 macaddr8 ... ok
 tstypes

Re: [HACKERS] static assertions in C++

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 6:37 PM, Tom Lane  wrote:
> Meh.  We support ancient versions of C for backwards compatibility
> reasons, but considering that compiling backend code with C++ isn't
> officially supported at all, I'm not sure we need to cater to ancient
> C++ compilers.  We could quibble about the value of "ancient" of
> course --- Peter, do you have an idea when this construct became
> widely supported?
>
> I do think it might be a better idea to put a #error there instead
> of silently disabling static assertions.  Then at least we could
> hope to get complaints if anyone *is* trying to use ancient C++,
> and thereby gauge whether it's worth working any harder for this.

I guess my question was whether we couldn't just use the same
workaround we use for old C compilers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] static assertions in C++

2017-08-31 Thread Tom Lane
Robert Haas writes:
> On Thu, Aug 31, 2017 at 4:43 PM, Peter Eisentraut wrote:
>> As discussed in
>> , a
>> more general solution would be to add specific C++ support for static
>> assertions in c.h.  Here is a patch for that, extracted from my
>> previously posted C++ patch set, but also a bit reworked from what was
>> previously posted.

> I like the concept of being more C++-compatible, but I'm not sure
> about the idea of not providing a workaround,

Meh.  We support ancient versions of C for backwards compatibility
reasons, but considering that compiling backend code with C++ isn't
officially supported at all, I'm not sure we need to cater to ancient
C++ compilers.  We could quibble about the value of "ancient" of
course --- Peter, do you have an idea when this construct became
widely supported?

I do think it might be a better idea to put a #error there instead
of silently disabling static assertions.  Then at least we could
hope to get complaints if anyone *is* trying to use ancient C++,
and thereby gauge whether it's worth working any harder for this.

regards, tom lane




Re: [HACKERS] static assertions in C++

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 4:43 PM, Peter Eisentraut wrote:
> Commit df1a699e5ba3232f373790b2c9485ddf720c4a70 introduced a
> StaticAssertStmt() into a header file, which will fail if a module
> written in C++ uses that header file.  Currently, that header file is
> not widely used, but it's a potential problem if the use of static
> assertions expands.
>
> As discussed in
> , a
> more general solution would be to add specific C++ support for static
> assertions in c.h.  Here is a patch for that, extracted from my
> previously posted C++ patch set, but also a bit reworked from what was
> previously posted.

I like the concept of being more C++-compatible, but I'm not sure
about the idea of not providing a workaround, given that we went to
the extreme of doing this:

#define StaticAssertStmt(condition, errmessage) \
    ((void) sizeof(struct { int static_assert_failure : (condition) ? 1 : -1; }))
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] static assertions in C++

2017-08-31 Thread Peter Eisentraut
Commit df1a699e5ba3232f373790b2c9485ddf720c4a70 introduced a
StaticAssertStmt() into a header file, which will fail if a module
written in C++ uses that header file.  Currently, that header file is
not widely used, but it's a potential problem if the use of static
assertions expands.

As discussed in
, a
more general solution would be to add specific C++ support for static
assertions in c.h.  Here is a patch for that, extracted from my
previously posted C++ patch set, but also a bit reworked from what was
previously posted.

Also attached is a little C++ test file that one can use to test this
out.  (Just compiling it should cause a compiler error without the patch
and a static assertion failure with the patch.)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 93adbf7bfa7afc1c8e25b037fce8227f3a225639 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 30 Aug 2016 12:00:00 -0400
Subject: [PATCH v3] Add support for static assertions in C++

This allows modules written in C++ to use or include header files that
use StaticAssertStmt() etc.
---
 src/include/c.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/include/c.h b/src/include/c.h
index af799dc1df..e238edb7d1 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -749,6 +749,7 @@ typedef NameData *Name;
  * about a negative width for a struct bit-field.  This will not include a
  * helpful error message, but it beats not getting an error at all.
  */
+#ifndef __cplusplus
 #ifdef HAVE__STATIC_ASSERT
 #define StaticAssertStmt(condition, errmessage) \
do { _Static_assert(condition, errmessage); } while(0)
@@ -760,6 +761,18 @@ typedef NameData *Name;
 #define StaticAssertExpr(condition, errmessage) \
StaticAssertStmt(condition, errmessage)
 #endif /* HAVE__STATIC_ASSERT */
+#else  /* C++ */
+#if defined(__cpp_static_assert) && __cpp_static_assert >= 200410
+#define StaticAssertStmt(condition, errmessage) \
+   static_assert(condition, errmessage)
+#else
+/* not worth providing a workaround */
+#define StaticAssertStmt(condition, errmessage) \
+   ((void) 0)
+#endif
+#define StaticAssertExpr(condition, errmessage) \
+   ({ StaticAssertStmt(condition, errmessage); true; })
+#endif /* C++ */
 
 
 /*

base-commit: b5c75feca7ffb2667c42b86286e262d6cb709b76
-- 
2.14.1

extern "C" {
#include "postgres.h"
}

extern "C"
int
somefunction(void)
{
StaticAssertStmt(1 + 1 == 3, "arithmetic!");

return 0;
}



Re: [HACKERS] sync process names between ps and pg_stat_activity

2017-08-31 Thread Tom Lane
Peter Eisentraut writes:
> As an aside, is there a reason why the archiver process is not included
> in pg_stat_activity?

It's not connected to shared memory.

regards, tom lane




Re: [HACKERS] Update low-level backup documentation to match actual behavior

2017-08-31 Thread David Steele
On 8/31/17 4:04 PM, Robert Haas wrote:
> On Wed, Aug 30, 2017 at 8:37 PM, Michael Paquier wrote:
>> Thanks for the new version. This looks fine to me.
> 
> Committed to REL9_6_STABLE with minor wordsmithing.

The edits look good to me. Thanks, Robert!

-- 
-David
da...@pgmasters.net




Re: [HACKERS] Update low-level backup documentation to match actual behavior

2017-08-31 Thread Robert Haas
On Wed, Aug 30, 2017 at 8:37 PM, Michael Paquier wrote:
> Thanks for the new version. This looks fine to me.

Committed to REL9_6_STABLE with minor wordsmithing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] sync process names between ps and pg_stat_activity

2017-08-31 Thread Peter Eisentraut
The process names shown in pg_stat_activity.backend_type as of PG10 and
the process names used in the ps display are in some cases gratuitously
different, so here is a patch to make them more alike.  Of course it
could be debated in some cases which spelling was better.

As an aside, is there a reason why the archiver process is not included
in pg_stat_activity?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 7b6509d284001af17d3c7c365e9db35d4015ec48 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 31 Aug 2017 15:55:49 -0400
Subject: [PATCH] Sync process names between ps and pg_stat_activity

Remove gratuitous differences in the process names shown in
pg_stat_activity.backend_type and the ps output.
---
 doc/src/sgml/monitoring.sgml        | 12 ++--
 src/backend/bootstrap/bootstrap.c   | 10 +-
 src/backend/postmaster/autovacuum.c |  4 ++--
 src/backend/postmaster/pgarch.c     |  2 +-
 src/backend/postmaster/pgstat.c     |  2 +-
 src/backend/postmaster/postmaster.c |  4 ++--
 src/backend/postmaster/syslogger.c  |  2 +-
 7 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 5575c2c837..61456dc12f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -49,11 +49,11 @@ Standard Unix Tools
 
 $ ps auxww | grep ^postgres
 postgres  15551  0.0  0.1  57536  7132 pts/0    S    18:02   0:00 postgres -i
-postgres  15554  0.0  0.0  57536  1184 ?        Ss   18:02   0:00 postgres: writer process
-postgres  15555  0.0  0.0  57536   916 ?        Ss   18:02   0:00 postgres: checkpointer process
-postgres  15556  0.0  0.0  57536   916 ?        Ss   18:02   0:00 postgres: wal writer process
-postgres  15557  0.0  0.0  58504  2244 ?        Ss   18:02   0:00 postgres: autovacuum launcher process
-postgres  15558  0.0  0.0  17512  1068 ?        Ss   18:02   0:00 postgres: stats collector process
+postgres  15554  0.0  0.0  57536  1184 ?        Ss   18:02   0:00 postgres: background writer
+postgres  15555  0.0  0.0  57536   916 ?        Ss   18:02   0:00 postgres: checkpointer
+postgres  15556  0.0  0.0  57536   916 ?        Ss   18:02   0:00 postgres: walwriter
+postgres  15557  0.0  0.0  58504  2244 ?        Ss   18:02   0:00 postgres: autovacuum launcher
+postgres  15558  0.0  0.0  17512  1068 ?        Ss   18:02   0:00 postgres: stats collector
 postgres  15582  0.0  0.0  58772  3080 ?        Ss   18:04   0:00 postgres: joe runbug 127.0.0.1 idle
 postgres  15606  0.0  0.0  58772  3052 ?        Ss   18:07   0:00 postgres: tgl regression [local] SELECT waiting
 postgres  15610  0.0  0.0  58772  3056 ?        Ss   18:07   0:00 postgres: tgl regression [local] idle in transaction
@@ -102,7 +102,7 @@ Standard Unix Tools
 (1 row)
 
 $ ps aux|grep server1
-postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: server1: writer process
+postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: server1: background writer
 ...
 
   
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 0453fd4ac1..2354ce43ae 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -308,19 +308,19 @@ AuxiliaryProcessMain(int argc, char *argv[])
switch (MyAuxProcType)
{
case StartupProcess:
-   statmsg = "startup process";
+   statmsg = pgstat_get_backend_desc(B_STARTUP);
break;
case BgWriterProcess:
-   statmsg = "writer process";
+   statmsg = pgstat_get_backend_desc(B_BG_WRITER);
break;
case CheckpointerProcess:
-   statmsg = "checkpointer process";
+   statmsg = pgstat_get_backend_desc(B_CHECKPOINTER);
break;
case WalWriterProcess:
-   statmsg = "wal writer process";
+   statmsg = pgstat_get_backend_desc(B_WAL_WRITER);
break;
case WalReceiverProcess:
-   statmsg = "wal receiver process";
+   statmsg = pgstat_get_backend_desc(B_WAL_RECEIVER);
break;
default:
statmsg = "??? process";
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 776b1c0a9d..b745d8962e 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -436,7 +436,7 @@ AutoVacLauncherMain(int argc, char *argv[])

Re: [HACKERS] expanding inheritance in partition bound order

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 2:56 AM, Ashutosh Bapat wrote:
> Here are the patches, revised a bit. I have especially changed the variable
> names and arguments to reflect their true role in the functions. Also
> updated the prologue of expand_single_inheritance_child() to mention
> "has_child". Let me know if those changes look good.

Sure.  Committed as you have it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




bgw_type (was Re: [HACKERS] Why does logical replication launcher set application_name?)

2017-08-31 Thread Peter Eisentraut
On 5/30/17 23:10, Peter Eisentraut wrote:
> Here is a proposed solution that splits bgw_name into bgw_type and
> bgw_name_extra.  bgw_type shows up in pg_stat_activity.backend_type.
> Uses of application_name are removed, because they are no longer
> necessary to identify the process type.

Updated patch incorporating the feedback.  I have kept bgw_name as it
was and just added bgw_type completely independently.

One open question is how to treat a missing (empty) bgw_type.  I
currently fill in bgw_name as a fallback.  We could also treat it as an
error or a warning as a transition measure.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 73142adf6e56e44d97bd9f855072cba17ef5ea4c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 31 Aug 2017 12:24:47 -0400
Subject: [PATCH v2] Add background worker type

Add bgw_type field to background worker structure.  It is intended to be
set to the same value for all workers of the same type, so they can be
grouped in pg_stat_activity, for example.

The backend_type column in pg_stat_activity now shows bgw_type for a
background worker.  The ps listing and some log messages no longer call
out that a process is a "background worker" but just show the bgw_type.
That way, being a background worker is effectively an implementation
detail now that is not shown to the user.
---
 contrib/pg_prewarm/autoprewarm.c   |  6 ++--
 doc/src/sgml/bgworker.sgml | 12 +--
 src/backend/access/transam/parallel.c  |  1 +
 src/backend/postmaster/bgworker.c  | 53 +++---
 src/backend/postmaster/postmaster.c| 10 ++
 src/backend/replication/logical/launcher.c |  3 ++
 src/backend/utils/adt/pgstatfuncs.c| 16 +++--
 src/include/postmaster/bgworker.h  |  2 ++
 src/test/modules/test_shm_mq/setup.c   |  2 +-
 src/test/modules/worker_spi/worker_spi.c   |  8 +++--
 10 files changed, 91 insertions(+), 22 deletions(-)

diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index cc0350e6d6..006c3153db 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -800,7 +800,8 @@ apw_start_master_worker(void)
worker.bgw_start_time = BgWorkerStart_ConsistentState;
strcpy(worker.bgw_library_name, "pg_prewarm");
strcpy(worker.bgw_function_name, "autoprewarm_main");
-   strcpy(worker.bgw_name, "autoprewarm");
+   strcpy(worker.bgw_name, "autoprewarm master");
+   strcpy(worker.bgw_type, "autoprewarm master");
 
if (process_shared_preload_libraries_in_progress)
{
@@ -840,7 +841,8 @@ apw_start_database_worker(void)
worker.bgw_start_time = BgWorkerStart_ConsistentState;
strcpy(worker.bgw_library_name, "pg_prewarm");
strcpy(worker.bgw_function_name, "autoprewarm_database_main");
-   strcpy(worker.bgw_name, "autoprewarm");
+   strcpy(worker.bgw_name, "autoprewarm worker");
+   strcpy(worker.bgw_type, "autoprewarm worker");
 
/* must set notify PID to wait for shutdown */
worker.bgw_notify_pid = MyProcPid;
diff --git a/doc/src/sgml/bgworker.sgml b/doc/src/sgml/bgworker.sgml
index b422323081..822632bf02 100644
--- a/doc/src/sgml/bgworker.sgml
+++ b/doc/src/sgml/bgworker.sgml
@@ -51,6 +51,7 @@ Background Worker Processes
 typedef struct BackgroundWorker
 {
 charbgw_name[BGW_MAXLEN];
+charbgw_type[BGW_MAXLEN];
 int bgw_flags;
 BgWorkerStartTime bgw_start_time;
 int bgw_restart_time;   /* in seconds, or BGW_NEVER_RESTART */
@@ -64,8 +65,15 @@ Background Worker Processes
   
 
   
-   bgw_name is a string to be used in log messages, process
-   listings and similar contexts.
+   bgw_name and bgw_type are
+   strings to be used in log messages, process listings and similar contexts.
+   bgw_type should be the same for all background
+   workers of the same type, so that it is possible to group such workers in a
+   process listing, for example.  bgw_name on the
+   other hand can contain additional information about the specific process.
+   (Typically, the string for bgw_name will contain
+   the string for bgw_type somehow, but that is not
+   strictly required.)
   
 
   
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 17b10383e4..9828259e6d 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -438,6 +438,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
	memset(&worker, 0, sizeof(worker));
snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker for PID %d",
 MyProcPid);
+   snprintf(worker.bgw_type, BGW_MAXLEN, "parallel worker");
worker.bgw_flags =
BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION
| 

Re: [HACKERS] expanding inheritance in partition bound order

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 3:36 AM, Amit Langote wrote:
> ISTM, the primary motivation for the EIBO patch at this point is to get
> the partitions ordered in a predictable manner so that the partition-wise
> join patch and update partition key patches could implement certain logic
> using an O(n) algorithm rather than an O(n^2) one.

That's part of it, but not the whole thing.  For example, BASIC
partition-wise join only needs a predictable order, not a
bound-ordered one.  But the next step is to be able to match up uneven
bounds - e.g. given [1000, 2000), [3000, 4000), [5000, 6000) on one
side and [1100, 2100), [2900,3900), and [5500,5600) on the other side,
we can still make it work.  That greatly benefits from being able to
iterate through all the bounds in order.
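
Just to sketch why ordered bounds make that cheap (illustrative C only,
not from any of the actual patches): a single two-pointer sweep can pair
up overlapping ranges in O(n):

    #include <stdio.h>

    typedef struct { int lo, hi; } Range;   /* half-open [lo, hi) */

    /* Pair up overlapping ranges from two lists sorted by bound. */
    static void
    match_bounds(const Range *a, int na, const Range *b, int nb)
    {
        int     i = 0,
                j = 0;

        while (i < na && j < nb)
        {
            if (a[i].lo < b[j].hi && b[j].lo < a[i].hi)
                printf("join [%d,%d) with [%d,%d)\n",
                       a[i].lo, a[i].hi, b[j].lo, b[j].hi);

            /* advance whichever side ends first */
            if (a[i].hi <= b[j].hi)
                i++;
            else
                j++;
        }
    }

Running this on the example above pairs [1000,2000) with [1100,2100),
[3000,4000) with [2900,3900), and [5000,6000) with [5500,5600) in one pass.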

> Neither of them depends
> on the actual order in the sense of, say, sticking a PathKey to the
> resulting Append.  Perhaps the patch to "Make the optimiser aware of
> partitions ordering" [1] will have to consider this somehow; maybe by
> limiting its scope to only the cases where the root partitioned table is
> range partitioned.

I think that doing a depth-first traversal as I've done here avoids
the need to limit it to that case.  If we did a breadth-first
traversal anything that was subpartitioned would end up having the
subpartitions at the end instead of in the sequence, but the
depth-first traversal avoids that issue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel worker error

2017-08-31 Thread Robert Haas
On Wed, Aug 30, 2017 at 11:19 AM, Robert Haas wrote:
> But since that's an established design fl^H^Hprinciple, maybe that
> means we should go with the approach of teaching SerializeGUCState()
> to ignore role altogether and instead have ParallelWorkerMain call
> SetCurrentRoleId using information passed via the FixedParallelState
> (not sure of the precise details here).

Could I get some opinions on the virtues of this approach, vs. any of
the other suggestions at or near
http://postgr.es/m/ca+tgmoasp90e33-mu2ypgs73ttj37m5hv-xqhjc7tpqx9wx...@mail.gmail.com
?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Assorted leaks and weirdness in parallel execution

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 2:13 PM, Tom Lane wrote:
> Yeah, it is different.  What I'm looking at is that nodeGather does
> DestroyTupleQueueReader as soon as it's seen EOF on a given tuple queue.
> That can't save any worker cycles.  The reason seems to be that it wants
> to collapse its array of TupleQueueReader pointers so only live queues are
> in it.  That's reasonable, but I'm inclined to implement it by making the
> Gather node keep a separate working array of pointers to only the live
> TupleQueueReaders.  The ParallelExecutorInfo would keep the authoritative
> array of all TupleQueueReaders that have been created, and destroy them in
> ExecParallelFinish.

Hmm, that's a thought.

> Your point is that we want to shut down the TupleQueueReaders immediately
> on rescan, which we do already.  Another possible scenario is to shut them
> down once we've reached the passed-down tuple limit (across the whole
> Gather, not per-child which is what 3452dc524 implemented).  I don't think
> what I'm suggesting would complicate that.

Yeah.  I think the way to do that would be to implement what is
mentioned in the comment for ExecShutdownNode: call that function on
the child plan as soon as the LIMIT is filled.

(Hmm, the reference to someday covering FDW in the header of that
comment is obsolete, isn't it?  Another oversight on my part.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] [PATCH] Assert that the correct locks are held when calling PageGetLSN()

2017-08-31 Thread Jacob Champion
Hello all,

While working on checksum support for GPDB, we noticed that several
callers of PageGetLSN() didn't follow the correct locking procedure.
To try to help ferret out present and future mistakes, we added an
assertion to PageGetLSN() that checks whether those locks were being
held correctly. This patch is a first-draft attempt to port that
assertion back up to postgres master, based on work by Asim Praveen,
Ashwin Agrawal, and myself.

The patch is really two pieces: add the assertion, and fix the callers
that would trip it. The latter part is still in progress, because I'm
running into some places where I'm not sure what the correct way
forward is.

(I'm a newbie to this list and this code base, so please don't
hesitate to correct me on anything, whether that's code- or
culture-related!)

= Notes/Observations =

- This change introduces a subtle source incompatibility: whereas
previously both Pages and PageHeaders could be passed to PageGetLSN(),
now callers must explicitly cast to Page if they don't already have
one.

- I originally tried to put (and the GPDB patch succeeds in putting)
the entire assertion implementation into PageGetLSN in bufpage.h. That
seems unworkable here -- buf_internals.h is no longer publicized, and
the assertion needs those internals to perform its checks. So I moved
the assertion into its own helper function in bufmgr.c. If assertions
are disabled, that helper declaration gets turned into an empty macro,
so there shouldn't be a performance hit.
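
For what it's worth, a purely illustrative sketch of the shape such a
helper could take (the names and details here are my guesses, not the
attached patch):

    /* In bufmgr.c, with access to buf_internals.h. */
    static void
    AssertPageIsLockedForLSN(Page page)
    {
        char   *pagePtr = (char *) page;

        /* Only pages living in shared buffers can be checked this way. */
        if (pagePtr >= BufferBlocks &&
            pagePtr < BufferBlocks + (Size) NBuffers * BLCKSZ)
        {
            int         bufid = (pagePtr - BufferBlocks) / BLCKSZ;
            BufferDesc *bufHdr = GetBufferDescriptor(bufid);
            LWLock     *lock = BufferDescriptorGetContentLock(bufHdr);

            /*
             * An exclusive content lock always suffices.  With only a
             * shared lock, the caller must instead go through
             * BufferGetLSNAtomic(), which takes the buffer header
             * spinlock itself.
             */
            Assert(LWLockHeldByMeInMode(lock, LW_EXCLUSIVE) ||
                   LWLockHeldByMe(lock));
        }
    }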

= Open Questions =

- Is my use of FRONTEND here correct? The documentation made it seem
like some compilers could try to link against the
AssertPageIsLockedForLSN function, even if PageGetLSN was never
called.

- Use of PageGetLSN() in heapam.c seems to be incorrect, but I don't
really know what the best way is to fix it. It explicitly unlocks the
buffer, with the comment that visibilitymap_set() needs to handle
locking itself, but then calls PageGetLSN() to determine whether
visibilitymap_set() needs to be called.

- TestForOldSnapshot is also problematic. The function is called in
places where only a shared lock is held, so it needs to hold the
header spinlock at least some of the time, but we only pass the Page
as an argument. I could do the same "find buffer descriptor from page
pointer" logic that we utilize in the assertion implementation, but is
there a better way?

- Should I try to add this assertion to PageSetLSN() as well? We were
focused mostly on the GetLSN side of things, since that was the more
risky piece of our backport. PageSetLSN is called from many more
places, and the review there would be much larger.

= Known Issues =

- This approach doesn't seem to work when checksums are disabled. In
that case, BufferGetLSNAtomic isn't actually atomic, so it seems to
always trip the assertion. I'm not sure of the best way to proceed --
is it really correct not to follow the locking contract if checksums
are disabled?

What do you think? Is this something worth pursuing? Any and all
comments welcome.

Thanks!
--Jacob


PageGetLSN-assert-locks-held.patch
Description: Binary data



Re: [HACKERS] Assorted leaks and weirdness in parallel execution

2017-08-31 Thread Tom Lane
Robert Haas writes:
> On Thu, Aug 31, 2017 at 11:09 AM, Tom Lane wrote:
>> (With this patch,
>> there are no callers of shm_mq_get_queue(); should we remove that?)

> May as well.  I can't remember any more why I did shm_mq_detach() that
> way; I think there was someplace where I thought that the
> shm_mq_handle might not be available.  Maybe I'm misremembering, or
> perhaps the situation has changed as that code has evolved.

I initially tried to convert the on_dsm_detach callback to take a
pointer to the shm_mq_handle rather than the shm_mq proper.  That
caused regression test crashes in some processes, indicating that
there are situations where we have freed the shm_mq_handle before
the DSM detach happens.  I think it was only during worker process exit.
That's sort of contrary to the advice in shm_mq.c about the desirable
lifespan of a shm_mq_handle, but I didn't feel like trying to fix it.
It seems generally more robust if the on_dsm_detach callback assumes
as little as possible about intra-process state, anyway.

I don't have any strong reason to remove shm_mq_get_queue(), other than
neatnik-ism.  It might save a caller having to remember the shm_mq pointer
separately.  Given the set of API functions, that would only matter if
somebody wanted to set/get the sender/receiver PGPROC pointers later,
but maybe that's a plausible thing to do.

>> It seems like a significant modularity violation that execParallel.c
>> is responsible for creating those shm_mqs but not for cleaning them up.

> Yeah, the correct division of labor between execParallel.c and
> nodeGather.c was not entirely clear to me, and I don't pretend that I
> got that 100% right.

OK, I'll have a go at that.

>> (That would make it more difficult to do the early reader destruction
>> that nodeGather currently does, but I am not sure we care about that.)

> I think the only thing that matters here is -- if we know that we're
> not going to read any more tuples from a worker that might still be
> generating tuples, it's imperative that we shut it down ASAP.
> Otherwise, it's just going to keep cranking them out, wasting
> resources unnecessarily.  I think this is different than what you're
> talking about here, but just wanted to be clear.

Yeah, it is different.  What I'm looking at is that nodeGather does
DestroyTupleQueueReader as soon as it's seen EOF on a given tuple queue.
That can't save any worker cycles.  The reason seems to be that it wants
to collapse its array of TupleQueueReader pointers so only live queues are
in it.  That's reasonable, but I'm inclined to implement it by making the
Gather node keep a separate working array of pointers to only the live
TupleQueueReaders.  The ParallelExecutorInfo would keep the authoritative
array of all TupleQueueReaders that have been created, and destroy them in
ExecParallelFinish.

Your point is that we want to shut down the TupleQueueReaders immediately
on rescan, which we do already.  Another possible scenario is to shut them
down once we've reached the passed-down tuple limit (across the whole
Gather, not per-child which is what 3452dc524 implemented).  I don't think
what I'm suggesting would complicate that.

regards, tom lane




Re: [HACKERS] The case for removing replacement selection sort

2017-08-31 Thread Peter Geoghegan
On Wed, Aug 30, 2017 at 4:59 PM, Peter Geoghegan wrote:
> I may submit the simple patch to remove replacement selection, if
> other contributors are receptive. Apart from everything else, the
> "incrementalism" of replacement selection works against cleverer batch
> memory management of the type I just outlined, which seems like a
> useful direction to take tuplesort in.

I attach a patch to remove replacement selection, which I'll submit to CF 1.


-- 
Peter Geoghegan
From 6ccad74113d3a13264a19653e13ef897be5c902d Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Wed, 30 Aug 2017 16:14:36 -0700
Subject: [PATCH] Remove replacement selection sort.

This simplifies tuplesort.c, since heap maintenance routines need no
longer concern themselves with cases where two runs are represented
within a heap.

Replacement selection sort's traditional advantages no longer allow it
to outperform a simple sort-merge strategy, even in sympathetic cases.
This is due to long term trends in hardware performance characteristics,
and enhancements to the tuplesort.c merge code in the past couple of
years.
---
 doc/src/sgml/config.sgml  |  39 ---
 src/backend/utils/init/globals.c  |   1 -
 src/backend/utils/misc/guc.c  |  10 -
 src/backend/utils/misc/postgresql.conf.sample |   1 -
 src/backend/utils/sort/tuplesort.c| 415 +++---
 src/include/miscadmin.h   |   1 -
 src/test/regress/expected/cluster.out |  17 +-
 src/test/regress/sql/cluster.sql  |  14 +-
 8 files changed, 51 insertions(+), 447 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 2b6255e..7c625e2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1513,45 +1513,6 @@ include_dir 'conf.d'
   
  
 
- 
-  replacement_sort_tuples (integer)
-  
-   replacement_sort_tuples configuration parameter
-  
-  
-  
-   
-When the number of tuples to be sorted is smaller than this number,
-a sort will produce its first output run using replacement selection
-rather than quicksort.  This may be useful in memory-constrained
-environments where tuples that are input into larger sort operations
-have a strong physical-to-logical correlation.  Note that this does
-not include input tuples with an inverse
-correlation.  It is possible for the replacement selection algorithm
-to generate one long run that requires no merging, where use of the
-default strategy would result in many runs that must be merged
-to produce a final sorted output.  This may allow sort
-operations to complete sooner.
-   
-   
-The default is 150,000 tuples.  Note that higher values are typically
-not much more effective, and may be counter-productive, since the
-priority queue is sensitive to the size of available CPU cache, whereas
-the default strategy sorts runs using a cache
-oblivious algorithm.  This property allows the default sort
-strategy to automatically and transparently make effective use
-of available CPU cache.
-   
-   
-Setting maintenance_work_mem to its default
-value usually prevents utility command external sorts (e.g.,
-sorts used by CREATE INDEX to build B-Tree
-indexes) from ever using replacement selection sort, unless the
-input tuples are quite wide.
-   
-  
- 
-
  
   autovacuum_work_mem (integer)
   
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 7c09498..9680a4b 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -112,7 +112,6 @@ bool		enableFsync = true;
 bool		allowSystemTableMods = false;
 int			work_mem = 1024;
 int			maintenance_work_mem = 16384;
-int			replacement_sort_tuples = 150000;
 
 /*
  * Primary determinants of sizes of shared-memory structures.
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 246fea8..a459672 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1932,16 +1932,6 @@ static struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"replacement_sort_tuples", PGC_USERSET, RESOURCES_MEM,
-			gettext_noop("Sets the maximum number of tuples to be sorted using replacement selection."),
-			gettext_noop("When more tuples than this are present, quicksort will be used.")
-		},
-		&replacement_sort_tuples,
-		150000, 0, INT_MAX,
-		NULL, NULL, NULL
-	},
-
 	/*
 	 * We use the hopefully-safely-small value of 100kB as the compiled-in
 	 * default for max_stack_depth.  InitializeGUCOptions will increase it if
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index df5d2f3..d0177d1 100644
--- 

Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 2:26 AM, Dilip Kumar  wrote:
> Thanks for the feedback.  I will work on it.

Another thought is that you probably want/need to test across a range
of work_mem values.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] GnuTLS support

2017-08-31 Thread Andreas Karlsson

Hi,

I have seen discussions from time to time about OpenSSL and its 
licensing issues so I decided to see how much work it would be to add 
support for another TLS library, and I went with GnuTLS since it is the 
library I know best after OpenSSL and it is also a reasonably popular 
library.


Attached is a work-in-progress patch which implements the basics. I am sending 
it to the list because I want feedback on some design questions and to 
check with the community whether this is a feature we want.


= What is implemented

- Backend
- Frontend
- Diffie-Hellmann support
- Using GnuTLS for secure random numbers
- Using GnuTLS for sha2

= Work left to do

- Add GnuTLS support to sslinfo
- Add GnuTLS support to pgcrypto
- Support for GnuTLS's version of engines
- Test code with older versions of GnuTLS
- Update documentation
- Add support for all postgresql.conf options (see design question)
- Fix two failing tests (see design question)

= Design questions

== GnuTLS priority strings vs OpenSSL cipher lists

GnuTLS uses a different format for specifying ciphers. Instead of setting 
ciphers, protocol versions, and ECDH curves separately, GnuTLS uses a 
single priority string[1].


For example the default settings of PostgreSQL (which include disabling 
SSLv3)


ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
ssl_prefer_server_ciphers = on
ssl_ecdh_curve = 'prime256v1'

is represented with a string similar to

SECURE128:+3DES-CBC:+GROUP-SECP256R1:%SERVER_PRECEDENCE

So the two libraries have decided on different ways to specify things.

One way to solve the issue would be to just let ssl_ciphers be the 
priority string and then add %SERVER_PRECEDENCE to it if 
ssl_prefer_server_ciphers is on. Adding the ssl_ecdh_curve is trickier 
since the curves have different names in GnuTLS (e.g. prime256v1 vs 
SECP256R1) and I would rather avoid having to add such a mapping to 
PostgreSQL. Thoughts?


== Potentially OpenSSL-specific test cases

There are currently two failing SSL tests which at least to me seem 
more like they test specific OpenSSL behaviors rather than something 
which needs to be true for all SSL libraries.


The two tests:

# Try with just the server CA's cert. This fails because the root file
# must contain the whole chain up to the root CA.
note "connect with server CA cert, without root CA";
test_connect_fails("sslrootcert=ssl/server_ca.crt sslmode=verify-ca");

# A CRL belonging to a different CA is not accepted, fails
test_connect_fails(
"sslrootcert=ssl/root+server_ca.crt sslmode=verify-ca 
sslcrl=ssl/client.crl");


For the missing root CA case GnuTLS seems to be happy enough with just 
an intermediate CA and as far as I can tell this behavior is not easy to 
configure.


And for the CRL belonging to a different CA case GnuTLS can be 
configured to either just store such a non-validating CRL or to ignore 
it, but not to return an error.


Personally I think these two tests should just be removed but maybe I am 
missing something.


Notes:

1. https://gnutls.org/manual/html_node/Priority-Strings.html

Andreas
diff --git a/configure b/configure
index a2f9a256b4..8dcb26b532 100755
--- a/configure
+++ b/configure
@@ -709,6 +709,7 @@ UUID_EXTRA_OBJS
 with_uuid
 with_systemd
 with_selinux
+with_gnutls
 with_openssl
 krb_srvtab
 with_python
@@ -838,6 +839,7 @@ with_bsd_auth
 with_ldap
 with_bonjour
 with_openssl
+with_gnutls
 with_selinux
 with_systemd
 with_readline
@@ -1534,6 +1536,7 @@ Optional Packages:
   --with-ldap build with LDAP support
   --with-bonjour  build with Bonjour support
   --with-openssl  build with OpenSSL support
+  --with-gnutls   build with GnuTLS support
   --with-selinux  build with SELinux support
   --with-systemd  build with systemd support
   --without-readline  do not use GNU Readline nor BSD Libedit for editing
@@ -6051,6 +6054,41 @@ fi
 $as_echo "$with_openssl" >&6; }
 
 
+#
+# GnuTLS
+#
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with GnuTLS support" >&5
+$as_echo_n "checking whether to build with GnuTLS support... " >&6; }
+
+
+
+# Check whether --with-gnutls was given.
+if test "${with_gnutls+set}" = set; then :
+  withval=$with_gnutls;
+  case $withval in
+yes)
+
+$as_echo "#define USE_GNUTLS 1" >>confdefs.h
+
+  ;;
+no)
+  :
+  ;;
+*)
+  as_fn_error $? "no argument expected for --with-gnutls option" "$LINENO" 5
+  ;;
+  esac
+
+else
+  with_gnutls=no
+
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_gnutls" >&5
+$as_echo "$with_gnutls" >&6; }
+
+
 #
 # SELinux
 #
@@ -10218,6 +10256,67 @@ done
 
 fi
 
+if test "$with_gnutls" = yes ; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5
+$as_echo_n "checking for library containing gnutls_init... " >&6; }
+if ${ac_cv_search_gnutls_init+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_func_search_save_LIBS=$LIBS
+cat confdefs.h - 

Re: [HACKERS] Assorted leaks and weirdness in parallel execution

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 11:09 AM, Tom Lane  wrote:
> I complained a couple weeks ago that nodeGatherMerge looked like it
> leaked a lot of memory when commanded to rescan.  Attached are three
> proposed patches that, in combination, demonstrably result in zero
> leakage across repeated rescans.

Gosh, thanks for looking into this so deeply.  I apologize for all of
the bugs.  Also, my ego is taking some severe damage here.

> But it's going to crash and burn someday.

Yeah, ouch.

> (With this patch,
> there are no callers of shm_mq_get_queue(); should we remove that?)

May as well.  I can't remember any more why I did shm_mq_detach() that
way; I think there was someplace where I thought that the
shm_mq_handle might not be available.  Maybe I'm misremembering, or
perhaps the situation has changed as that code has evolved.

> The last patch fixes the one remaining leak I saw after applying the
> first two patches, namely that execParallel.c leaks the array palloc'd
> by ExecParallelSetupTupleQueues --- just the array storage, not any of
> the shm_mq_handles it points to.  The given patch just adds a pfree
> to ExecParallelFinish, but TBH I find this pretty unsatisfactory.
> It seems like a significant modularity violation that execParallel.c
> is responsible for creating those shm_mqs but not for cleaning them up.

Yeah, the correct division of labor between execParallel.c and
nodeGather.c was not entirely clear to me, and I don't pretend that I
got that 100% right.

> (That would make it more difficult to do the early reader destruction
> that nodeGather currently does, but I am not sure we care about that.)

I think the only thing that matters here is -- if we know that we're
not going to read any more tuples from a worker that might still be
generating tuples, it's imperative that we shut it down ASAP.
Otherwise, it's just going to keep cranking them out, wasting
resources unnecessarily.  I think this is different than what you're
talking about here, but just wanted to be clear.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-08-31 Thread Andreas Karlsson
I have attached a new, rebased version of the patch with most of Banck's 
and some of your feedback incorporated. Thanks for the good feedback!


On 03/31/2017 08:27 AM, Michael Paquier wrote:
> When running REINDEX SCHEMA CONCURRENTLY public on the regression
> database I am bumping into a bunch of these warnings:
> WARNING:  01000: snapshot 0x7fa5e640 still active
> LOCATION:  AtEOXact_Snapshot, snapmgr.c:1123
> WARNING:  01000: snapshot 0x7fa5e640 still active
> LOCATION:  AtEOXact_Snapshot, snapmgr.c:1123


I failed to reproduce this. Do you have a reproducible test case?


> + * Reset attcacheoff for a TupleDesc
> + */
> +void
> +ResetTupleDescCache(TupleDesc tupdesc)
> +{
> +   int i;
> +
> +   for (i = 0; i < tupdesc->natts; i++)
> +   tupdesc->attrs[i]->attcacheoff = -1;
> +}
> I think that it would be better to merge that with TupleDescInitEntry
> to be sure that the initialization of a TupleDesc's attribute goes
> through only one code path.


Sorry, but I am not sure I understand your suggestion. I do not like the 
ResetTupleDescCache function so all suggestions are welcome.

> -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM
> } name
> +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM
> } [ CONCURRENTLY ] name
> I am taking the war path with such a sentence... But what about adding
> CONCURRENTLY to the list of options in parenthesis instead?


I have thought some about this myself and I do not care strongly either way.


> - Explore the use of SQL-level interfaces to mark an index as inactive
> at creation.
> - Remove work done in changeDependencyForAll, and replace it by
> something similar to what tablecmds.c does. There is I think here some
> place for refactoring if that's not with CREATE TABLE LIKE. This
> requires to the same work of creation, renaming and drop of the old
> triggers and constraints.


I am no fan of the current code duplication and how fragile it is, but I 
think these cases are sufficiently different to prevent meaningful code 
reuse. But it could just be me who is unfamiliar with that part of the code.



> - Do a per-index rebuild and not a per-relation rebuild for concurrent
> indexing. Doing a per-relation reindex has the disadvantage that many
> objects need to be created at the same time, and in the case of
> REINDEX CONCURRENTLY time of the operation is not what matters, it is
> how intrusive the operation is. Relations with many indexes would also
> result in many object locks taken at each step.


I am still leaning towards my current tradeoff since waiting for all 
queries to stop using an index can take a lot of time and if you only 
have to do that once per table it would be a huge benefit under some 
workloads, and you can still reindex each index separately if you need to.


Andreas

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index dda0170886..c97944b2c9 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,7 +926,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
  Acquired by VACUUM (without FULL),
  ANALYZE, CREATE INDEX CONCURRENTLY,
- CREATE STATISTICS and
+ REINDEX CONCURRENTLY, CREATE STATISTICS and
  ALTER TABLE VALIDATE and other
  ALTER TABLE variants (for full details see
  ).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..4ef3a89a29 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
 
  
 
@@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
   An index build with the CONCURRENTLY option failed, leaving
   an invalid index. Such indexes are useless but it can be
-  convenient to use REINDEX to rebuild them. Note that
-  REINDEX will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the CREATE INDEX CONCURRENTLY command.
+  convenient to use REINDEX to rebuild them.
  
 
 
@@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 

 
+   
+CONCURRENTLY
+
+ 
+  When this option is used, PostgreSQL will rebuild the
+  index without taking any locks that prevent concurrent inserts,
+  updates, or deletes on the table; whereas a standard reindex build
+  locks out writes (but not reads) on the table until it's done.
+  There are several caveats to be aware of when using this option
+   see .
+ 
+
+   
+

 VERBOSE
 
@@ -231,6 +243,173 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 
+  
+   Rebuilding Indexes Concurrently
+
+   
+   index
+   rebuilding concurrently
+   
+

[HACKERS] Assorted leaks and weirdness in parallel execution

2017-08-31 Thread Tom Lane
I complained a couple weeks ago that nodeGatherMerge looked like it
leaked a lot of memory when commanded to rescan.  Attached are three
proposed patches that, in combination, demonstrably result in zero
leakage across repeated rescans.

The first thing I noticed when I started digging into this was that
there was some leakage in TopMemoryContext, which seemed pretty weird.
What it turned out to be was on_dsm_detach callback registration records.
This happens because, although the comments for shm_mq_attach() claim
that shm_mq_detach() will free the shm_mq_handle, it does no such thing,
and it doesn't worry about canceling the on_dsm_detach registration
either.  So over repeated attach/detach cycles, we leak shm_mq_handles
and also callback registrations.  This isn't just a memory leak: it
means that, whenever we finally do detach from the DSM segment, we'll
execute a bunch of shm_mq_detach() calls pointed at long-since-detached-
and-reused shm_mq structs.  That seems incredibly dangerous.  It manages
not to fail ATM because our stylized use of DSM means that a shm_mq
would only ever be re-used as another shm_mq; so the only real effect is
that our last counterparty process, if still attached, would receive N
SetLatch events not just one.  But it's going to crash and burn someday.

For extra fun, the error MQs weren't ever explicitly detached from,
just left to rot until on_dsm_detach time.  Although we did pfree the
shm_mq_handles out from under them.

So the first patch attached cleans this up by making shm_mq_detach
do what it was advertised to, ie fully reverse what shm_mq_attach
does.  That means it needs to take a shm_mq_handle, not a bare shm_mq,
but that actually makes the callers cleaner anyway.  (With this patch,
there are no callers of shm_mq_get_queue(); should we remove that?)
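
For clarity, the reshaped entry point looks roughly like this (a sketch of
the shape only; the helper names here are illustrative, the real code is in
the attached patch):

    /* Detach a shared message queue and release the handle's resources. */
    void
    shm_mq_detach(shm_mq_handle *mqh)    /* formerly took a bare shm_mq * */
    {
        /* mark our end of the queue detached in shared memory */
        shm_mq_detach_internal(mqh->mqh_queue);

        /* also cancel the cleanup callback registered by shm_mq_attach */
        if (mqh->mqh_segment != NULL)
            cancel_on_dsm_detach(mqh->mqh_segment,
                                 shm_mq_detach_callback,
                                 PointerGetDatum(mqh->mqh_queue));

        pfree(mqh);
    }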

The second patch cleans up assorted garden-variety leaks when
rescanning a GatherMerge node, by having it allocate its work
arrays just once and then re-use them across rescans.

The last patch fixes the one remaining leak I saw after applying the
first two patches, namely that execParallel.c leaks the array palloc'd
by ExecParallelSetupTupleQueues --- just the array storage, not any of
the shm_mq_handles it points to.  The given patch just adds a pfree
to ExecParallelFinish, but TBH I find this pretty unsatisfactory.
It seems like a significant modularity violation that execParallel.c
is responsible for creating those shm_mqs but not for cleaning them up.
That cleanup currently happens as a result of DestroyTupleQueueReader
calls done by nodeGather.c or nodeGatherMerge.c.  I'm tempted to
propose that we should move both the creation and the destruction of
the TupleQueueReaders into execParallel.c; the current setup is not
just weird but requires duplicative coding in the Gather nodes.
(That would make it more difficult to do the early reader destruction
that nodeGather currently does, but I am not sure we care about that.)
Another thing that seems like a poor factorization choice is that
DestroyTupleQueueReader is charged with doing shm_mq_detach even though
tqueue.c did not do the shm_mq_attach ... should we rethink that?

Comments?

regards, tom lane

diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 17b1038..ce1b907 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -480,7 +480,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 			 */
 			any_registrations_failed = true;
 			pcxt->worker[i].bgwhandle = NULL;
-			pfree(pcxt->worker[i].error_mqh);
+			shm_mq_detach(pcxt->worker[i].error_mqh);
 			pcxt->worker[i].error_mqh = NULL;
 		}
 	}
@@ -612,7 +612,7 @@ DestroyParallelContext(ParallelContext *pcxt)
 			{
 TerminateBackgroundWorker(pcxt->worker[i].bgwhandle);
 
-pfree(pcxt->worker[i].error_mqh);
+shm_mq_detach(pcxt->worker[i].error_mqh);
 pcxt->worker[i].error_mqh = NULL;
 			}
 		}
@@ -861,7 +861,7 @@ HandleParallelMessage(ParallelContext *pcxt, int i, StringInfo msg)
 
 		case 'X':/* Terminate, indicating clean exit */
 			{
-pfree(pcxt->worker[i].error_mqh);
+shm_mq_detach(pcxt->worker[i].error_mqh);
 pcxt->worker[i].error_mqh = NULL;
 break;
 			}
diff --git a/src/backend/executor/tqueue.c b/src/backend/executor/tqueue.c
index 4c4fcf5..cb262d8 100644
--- a/src/backend/executor/tqueue.c
+++ b/src/backend/executor/tqueue.c
@@ -578,7 +578,7 @@ tqueueShutdownReceiver(DestReceiver *self)
 {
 	TQueueDestReceiver *tqueue = (TQueueDestReceiver *) self;
 
-	shm_mq_detach(shm_mq_get_queue(tqueue->queue));
+	shm_mq_detach(tqueue->queue);
 }
 
 /*
@@ -650,7 +650,7 @@ CreateTupleQueueReader(shm_mq_handle *handle, TupleDesc tupledesc)
 void
 DestroyTupleQueueReader(TupleQueueReader *reader)
 {
-	shm_mq_detach(shm_mq_get_queue(reader->queue));
+	shm_mq_detach(reader->queue);
 	if (reader->typmodmap != NULL)
 		hash_destroy(reader->typmodmap);
 	/* Is it worth trying to 

Re: [HACKERS] Hooks to track changed pages for backup purposes

2017-08-31 Thread Michael Paquier
On Thu, Aug 31, 2017 at 3:02 PM, Andrey Borodin  wrote:
> Here is the patch with hooks that I consider sufficient for implementation of 
> incremental backup with pages tracking as extension.
>
> Recently I was posting these things to the thread "Adding hook in BufferSync 
> for backup purposes" [0], but here I start separate thread since Subj field 
> of previous discussion is technically wrong.
>
> Currently various incremental backups can use one of these methods to take 
> a diff of a cluster since some LSN:
> 1. Check LSN of every page
> 2. Scan WAL and collect block numbers of changed pages
>
> I propose adding hooks:
> 1. When a buffer is registered in WAL insertion
> This hook is supposed to place blocknumbers in a temporary storage, like 
> backend-local static array.
> 2. When a WAL record insertion is started and finished, to transfer 
> blocknumbers to more concurrency-protected storage.
> 3. When the WAL segment is switched to initiate async transfer of accumulated 
> blocknumbers to durable storage.
>
> When we have accumulated diff blocknumbers for most segments we can 
> significantly speed up the WAL scanning method. If we have blocknumbers for 
> all segments we can skip WAL scanning altogether.
>
> I think that these proposed hooks can enable more efficient backups. What do 
> you think?
>
> Any ideas will be appreciated. This patch is influenced by the code of PTRACK 
> (Yury Zhuravlev and Postgres Professional).

+if (xlog_insert_buffer_hook)
+for(nblock = 0; nblock < xlogreader->max_block_id; nblock++)
+{
+if(xlogreader->blocks[nblock].forknum == MAIN_FORKNUM)
+{
+xlog_insert_buffer_hook(xlogreader->blocks[nblock].blkno,
+xlogreader->blocks[nblock].rnode, true);
+}
+}
[...]
+if(xlog_begin_insert_hook)
+xlog_begin_insert_hook();
Such things are not Postgres-C like. The first one should use
brackets, and the second one proper spacing after "if".
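
From those call sites, the hooks the patch adds would be declared roughly
like this (reconstructed from the snippet, so the exact types and the
meaning of the bool argument are guesses):

    typedef void (*xlog_insert_buffer_hook_type) (BlockNumber blkno,
                                                  RelFileNode rnode,
                                                  bool is_insert);
    extern PGDLLIMPORT xlog_insert_buffer_hook_type xlog_insert_buffer_hook;

    typedef void (*xlog_begin_insert_hook_type) (void);
    extern PGDLLIMPORT xlog_begin_insert_hook_type xlog_begin_insert_hook;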

I don't understand what xlog_begin_insert_hook() is good for. There
are no arguments fed to this hook, so modules would not be able to
analyze things in this context, except shared memory and process
state?

Those hooks are put in hot code paths, and could impact performance of
WAL insertion itself. So you basically move the cost of scanning WAL
segments for those blocks from any backup solution to the WAL
insertion itself. Really, wouldn't it be more simple to let for
example the archiver process to create this meta-data if you just want
to take faster backups with a set of segments? Even better, you could
do a scan after archiving N segments, and then use M jobs to do this
work more quickly. (A set of background workers could do this job as
well).

In the backup/restore world, backups can be allowed to be taken at a
slow pace, what matters is to be able to restore them quickly. In
short, anything moving performance from an external backup code path
to a critical backend code path looks like a bad design to begin with.
So I am dubious that what you are proposing here is a good idea.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multiple target of VACUUM command

2017-08-31 Thread Michael Paquier
On Thu, Aug 31, 2017 at 9:53 PM, Kyotaro HORIGUCHI
 wrote:
> I sometimes feel annoyed when trying to VACUUM multiple specific
> tables.
>
> postgres=# vacuum a, b;
> ERROR:  syntax error at or near ","
> LINE 1: vacuum a, b;
>
> This patch just allows multiple targets for VACUUM command.

There is a patch for the same feature by Nathan Bossart which is being
discussed already in this commit fest:
https://www.postgresql.org/message-id/e061a8e3-5e3d-494d-94f0-e8a9b312b...@amazon.com
It had already a couple of rounds of reviews, and is getting close to
something that could be committed. There is still a pending bug
related to the use of RangeVar though with autovacuum.

Your approach is missing a couple of points. For example, when
specifying multiple targets, we have decided to raise an ERROR at the
beginning of VACUUM for a missing relation, but to issue a WARNING if one
goes missing in the middle of processing the list, so your set of patches
would provide a frustrating experience.
reshaping a bit the API of vacuum(), so I would recommend looking at
what has been already proposed if you are interested.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generated columns

2017-08-31 Thread Greg Stark
On 31 August 2017 at 05:16, Peter Eisentraut
 wrote:
> Here is another attempt to implement generated columns.  This is a
> well-known SQL-standard feature, also available for instance in DB2,
> MySQL, Oracle.  A quick example:
>
>   CREATE TABLE t1 (
> ...,
> height_cm numeric,
> height_in numeric GENERATED ALWAYS AS (height_cm * 2.54)
>   );

I only recently discovered we actually already have this feature. Kind of.
stark=# CREATE TABLE t1 (height_cm numeric);
CREATE TABLE
Time: 38.066 ms
stark***=# create function height_in(t t1) returns numeric language
'sql' as 'select t.height_cm * 2.54' ;
CREATE FUNCTION
Time: 1.216 ms
stark***=# insert into t1 values (2);
INSERT 0 1
Time: 10.170 ms
stark***=# select t1.height_cm, t1.height_in from t1;
┌───┬───┐
│ height_cm │ height_in │
├───┼───┤
│ 2 │  5.08 │
└───┴───┘
(1 row)

Time: 1.997 ms

Yours looks better :)

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Hash take II

2017-08-31 Thread Thomas Munro
Here's a new rebased and debugged patch set.

On Tue, Aug 1, 2017 at 1:11 PM, Andres Freund  wrote:
> - Echoing concerns from other threads (Robert: ping): I'm doubtful that
>   it makes sense to size the number of parallel workers solely based on
>   the parallel scan node's size.  I don't think it's this patch's job to
>   change that, but to me it seriously amplifys that - I'd bet there's a
>   lot of cases with nontrivial joins where the benefit from parallelism
>   on the join level is bigger than on the scan level itself.  And the
>   number of rows in the upper nodes might also be bigger than on the
>   scan node level, making it more important to have higher number of
>   nodes.

Agreed that this is bogus.  The number of workers is really determined
by the outer path (the probe side), except that if the inner path (the
build side) is not big enough to warrant parallel workers at all then
parallelism is inhibited on that side.  That prevents small tables
from being loaded by Parallel Hash.  That is something we want, but
it's probably not doing it for the right reasons with the right
threshold -- about which more below.

> - If I understand the code in initial_cost_hashjoin() correctly, we
>   count the synchronization overhead once, independent of the number of
>   workers.  But on the other hand we calculate the throughput by
>   dividing by the number of workers.  Do you think that's right?

It's how long you think the average participant will have to wait for
the last participant to arrive, and I think that's mainly determined
by the parallel grain, not the number of workers.  If you're a worker
that has reached the end of a scan, the best case is that every other
worker has already reached the end too and the worst case is that
another worker read the last granule (currently a page) just before you
hit the end, so you'll have to wait for it to process a granule's
worth of work.

To show this I used dtrace to measure the number of microseconds spent
waiting at the barrier before probing while running a 5 million row
self-join 100 times, and got the following histograms:

1 worker:

   value  - Distribution - count
 < 0 | 0
   0 |@@   110
  20 | 1
  40 |@5
  60 |@24
  80 |@@@  14
 100 |@5
 120 |@@@  16
 140 |@@@  17
 160 |@@   8
 180 | 0

2 workers:

   value  - Distribution - count
 < 0 | 0
   0 |@@   107
  20 | 1
  40 |@@@  21
  60 |@@@  25
  80 |@@   16
 100 |@@   14
 120 |@38
 140 |@@@  51
 160 |@@@  20
 180 | 3
 200 | 1
 220 | 3
 240 | 0

3 workers:

   value  - Distribution - count
 < 0 | 0
   0 |@@@  113
  20 |@@   15
  40 |@@@  29
  60 | 35
  80 | 37
 100 |@@   56
 120 |@51
 140 |@@@  31
 160 |@@   21
 180 |@6

4 workers:

   value  - Distribution - count
 < 0 | 0
   0 |@@   121
  20 | 4
  40 |@@@  39
  60 |@@   29
  80 |@@

[HACKERS] multiple target of VACUUM command

2017-08-31 Thread Kyotaro HORIGUCHI
Hello,

I sometimes feel annoyed when trying to VACUUM multiple specific
tables.

postgres=# vacuum a, b;
ERROR:  syntax error at or near ","
LINE 1: vacuum a, b;

This patch just allows multiple targets for VACUUM command.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 481690f0d84a21db755a986a7f785e8bbbe0769e Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Jul 2017 13:30:59 +0900
Subject: [PATCH 1/2] Make VacuumStmt capable of having multiple table parameters

---
 src/backend/commands/vacuum.c  | 18 +++---
 src/backend/nodes/copyfuncs.c  | 13 +
 src/backend/nodes/equalfuncs.c | 11 +++
 src/backend/parser/gram.y  | 29 -
 src/include/nodes/nodes.h  |  1 +
 src/include/nodes/parsenodes.h | 10 --
 6 files changed, 64 insertions(+), 18 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index faa1812..d6cd352 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -85,12 +85,12 @@ void
 ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 {
 	VacuumParams params;
+	ListCell*lc;
 
 	/* sanity checks on options */
 	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
 	Assert((vacstmt->options & VACOPT_VACUUM) ||
 		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
-	Assert((vacstmt->options & VACOPT_ANALYZE) || vacstmt->va_cols == NIL);
 	Assert(!(vacstmt->options & VACOPT_SKIPTOAST));
 
 	/*
@@ -119,8 +119,20 @@ ExecVacuum(VacuumStmt *vacstmt, bool isTopLevel)
 	params.log_min_duration = -1;
 
 	/* Now go through the common routine */
-	vacuum(vacstmt->options, vacstmt->relation, InvalidOid, &params,
-		   vacstmt->va_cols, NULL, isTopLevel);
+	if (list_length(vacstmt->relcols) == 0)
+		vacuum(vacstmt->options, NULL, InvalidOid, &params,
+			   NIL, NULL, isTopLevel);
+	else
+	{
+		foreach (lc, vacstmt->relcols)
+		{
+			VacRelCols *relcol = (VacRelCols *) lfirst(lc);
+			Assert((vacstmt->options & VACOPT_ANALYZE) ||
+   relcol->va_cols == NIL);
+			vacuum(vacstmt->options, relcol->relation, InvalidOid, &params,
+   relcol->va_cols, NULL, isTopLevel);
+		}
+	}
 }
 
 /*
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 7204169..761f758 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3764,6 +3764,16 @@ _copyVacuumStmt(const VacuumStmt *from)
 	VacuumStmt *newnode = makeNode(VacuumStmt);
 
 	COPY_SCALAR_FIELD(options);
+	COPY_NODE_FIELD(relcols);
+
+	return newnode;
+}
+
+static VacRelCols *
+_copyVacRelCols(const VacRelCols *from)
+{
+	VacRelCols *newnode = makeNode(VacRelCols);
+
 	COPY_NODE_FIELD(relation);
 	COPY_NODE_FIELD(va_cols);
 
@@ -5527,6 +5537,9 @@ copyObjectImpl(const void *from)
 		case T_PartitionCmd:
 			retval = _copyPartitionCmd(from);
 			break;
+		case T_VacRelCols:
+			retval = _copyVacRelCols(from);
+			break;
 
 			/*
 			 * MISCELLANEOUS NODES
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 8d92c03..aea7168 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1663,6 +1663,14 @@ static bool
 _equalVacuumStmt(const VacuumStmt *a, const VacuumStmt *b)
 {
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_NODE_FIELD(relcols);
+
+	return true;
+}
+
+static bool
+_equalVacRelCols(const VacRelCols *a, const VacRelCols *b)
+{
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_NODE_FIELD(va_cols);
 
@@ -3675,6 +3683,9 @@ equal(const void *a, const void *b)
 		case T_PartitionCmd:
 			retval = _equalPartitionCmd(a, b);
 			break;
+		case T_VacRelCols:
+			retval = _equalVacRelCols(a, b);
+			break;
 
 		default:
 			elog(ERROR, "unrecognized node type: %d",
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7d0de99..844c691 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10128,13 +10128,13 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose
 		n->options |= VACOPT_FREEZE;
 	if ($4)
 		n->options |= VACOPT_VERBOSE;
-	n->relation = NULL;
-	n->va_cols = NIL;
+	n->relcols = NIL;
 	$$ = (Node *)n;
 }
 			| VACUUM opt_full opt_freeze opt_verbose qualified_name
 {
 	VacuumStmt *n = makeNode(VacuumStmt);
+	VacRelCols *relcol = makeNode(VacRelCols);
 	n->options = VACOPT_VACUUM;
 	if ($2)
 		n->options |= VACOPT_FULL;
@@ -10142,8 +10142,9 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose
 		n->options |= VACOPT_FREEZE;
 	if ($4)
 		n->options |= VACOPT_VERBOSE;
-	n->relation = $5;
-	n->va_cols = NIL;
+	relcol->relation = $5;
+	relcol->va_cols = NIL;
+	n->relcols = list_make1(relcol);
 	$$ = (Node *)n;
 }
 			| VACUUM opt_full opt_freeze opt_verbose AnalyzeStmt
@@ -10162,18 +10163,19 @@ VacuumStmt: VACUUM opt_full opt_freeze opt_verbose
 {
 	VacuumStmt *n = makeNode(VacuumStmt);
 	n->options = VACOPT_VACUUM | $3;
-	

Re: [HACKERS] asynchronous execution

2017-08-31 Thread Kyotaro HORIGUCHI
At Thu, 03 Aug 2017 09:30:57 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20170803.093057.261590619.horiguchi.kyot...@lab.ntt.co.jp>
> > Unfortunately, that's probably another gigantic patch (that
> > should probably be written by Andres).
> 
> Yeah, but async executor on the current style of executor seems
> futile work, or sitting until the patch comes is also a waste of
> time. So I'm planning to include the following stuff in the next
> PoC patch. Even I'm not sure it can land on the coming
> Andres'patch.
> 
> - Tuple passing outside call-stack. (I remember it was in the
>   past of the thread around but not found)
> 
>   This should be included in the Andres' patch.
> 
> - Give executor an ability to run from data-source (or driver)
>   nodes to the root.
> 
>   I'm not sure this is included, but I suppose he is aiming this
>   kind of thing.
> 
> - Rebuid asynchronous execution on the upside-down executor.

Anyway, I modified ExecProcNode into push-up form and it *seems* to
work to some extent. But triggers and cursors are almost broken
and several other regression tests fail. Some nodes, such as
windowagg, are terribly difficult to change to this push-up form
(using a state machine). And of course it is terribly inefficient.

I'm afraid that all of this will turn out to be in vain. But anyway,
and FWIW, I'll post the work here after some cleanup.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] log_destination=file

2017-08-31 Thread Tom Lane
Magnus Hagander  writes:
> On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane  wrote:
>> Right, because the decision whether to redirect stdout/stderr can't
>> be changed on the fly.

> Right.

> We could of course also say we only care about things generated by our
> ereport framework, in which case we don't need to redirect stderr and can
> just use a "regular pipe". But IIRC that's functionality we also
> specifically wanted (though I don't think I've ever needed it myself,
> presumably others have).

Yes, it's pretty important, because of assorted stuff not-under-our-
control that doesn't know about ereport and will just print to stderr
anyway.  Some examples: dynamic linker can't-resolve-symbol failure
messages, glibc malloc corruption error messages, just about any external
module in plperl or plpython.  I don't find this to be negotiable.

> Are you actually asking for a benchmark of if logging gets slower?

Yes.

> If so,
> could you suggest a workload to make an actual benchmark of it (where
> logging would be high enough that it could be come a bottleneck -- and not
> writing the log data to disk, but the actual logging). I'm not sure what a
> good one would be.

pgbench with log_statement = all would be a pretty easy test case.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-08-31 Thread amul sul
On Wed, Aug 30, 2017 at 9:05 PM, Robert Haas  wrote:

> On Wed, Aug 30, 2017 at 10:43 AM, amul sul  wrote:
> > Thanks for the suggestion, I have updated 0002-patch accordingly.
> > Using this I found some strange behaviours as follow:
> >
> > 1) standard and extended0 output for the jsonb_hash case is not same.
> > 2) standard and extended0 output for the hash_range case is not same when
> > input
> >is int4range(550274, 1550274)  other case in the patch are fine. This
> can
> > be
> >reproducible with other values as well, for e.g. int8range(1570275,
> > 208112489).
> >
> > Will look into this tomorrow.
>
> Those sound like bugs in your patch.  Specifically:
>
> +/* Same approach as hash_range */
> +result = hash_uint32_extended((uint32) flags, seed);
> +result ^= lower_hash;
> +result = (result << 1) | (result >> 63);
> +result ^= upper_hash;
>

Yes, you are correct.


>
> That doesn't give compatible results.  The easiest thing to do might
> be to rotate the high 32 bits and the low 32 bits separately.
> JsonbHashScalarValueExtended has the same problem.  Maybe create a
> helper function that does something like this (untested):
>
> ((x << 1) & UINT64CONST(0xfffffffefffffffe)) | ((x >> 31) &
> UINT64CONST(0x0000000100000001))
>

This is working as expected; I also tested it by executing the following SQL
multiple times:

SELECT v as value, hash_range(v)::bit(32) as standard,
   hash_range_extended(v, 0)::bit(32) as extended0,
   hash_range_extended(v, 1)::bit(32) as extended1
FROM   (VALUES (int8range(floor(random() * 100)::int8, floor(random() *
1000)::int8)),
(int8range(floor(random() * 1000)::int8, floor(random() *
1)::int8)),
(int8range(floor(random() * 1)::int8, floor(random() *
10)::int8)),
 (int8range(floor(random() * 1000)::int8, floor(random() *
1)::int8)),
 (int8range(floor(random() * 1)::int8, floor(random() *
10)::int8))) x(v)
WHERE  hash_range(v)::bit(32) != hash_range_extended(v, 0)::bit(32)
   OR hash_range(v)::bit(32) = hash_range_extended(v, 1)::bit(32);
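
Based on Robert's suggested expression, the helper could look like this
(the name and placement are hypothetical; the masks let each 32-bit half
rotate left by one bit without bits crossing between the halves):

    /* Rotate the high and low 32-bit halves left by one bit each. */
    static inline uint64
    hash_combine_rot32(uint64 x)
    {
        return ((x << 1) & UINT64CONST(0xfffffffefffffffe)) |
               ((x >> 31) & UINT64CONST(0x0000000100000001));
    }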



> > Another case which I want to discuss is, extended and standard version of
> > hashfloat4, hashfloat8 & hash_numeric function will have the same output
> for
> > zero
> > value irrespective of seed value. Do you think we need a fix for this?
>
> Yes, I think you should return the seed rather than 0 in the cases
> where the current code hard-codes a 0 return.  That will give the same
> behavior in the seed == 0 case while cheaply getting us something a
> bit different when there is a seed.
>

Fixed in the attached version.



> BTW, you should probably invent and use a PG_RETURN_UINT64 macro in this
> patch.
>
Added in the attached version.

Thanks for all your suggestions.

Regards,
Amul


0001-add-optional-second-hash-proc-v4.patch
Description: Binary data


0002-test-Hash_functions_v3_wip.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] log_destination=file

2017-08-31 Thread Magnus Hagander
On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane  wrote:

> Magnus Hagander  writes:
> > My understanding is that the main reason for this is that we cannot
> change
> > logging_collector without restarting postmaster, whereas we can change
> > log_destination.
>
> Right, because the decision whether to redirect stdout/stderr can't
> be changed on the fly.
>

Right.

We could of course also say we only care about things generated by our
ereport framework, in which case we don't need to redirect stderr and can
just use a "regular pipe". But IIRC that's functionality we also
specifically wanted (though I don't think I've ever needed it myself,
presumably others have).




> > My suggestion is we work around this by just always starting the logging
> > collector, even if we're not planning to use it.
>
> Umm
>
> > Do people see an actual problem with that? I agree it's an extra round of
> > indirection there, but is that a problem? It would also cause one more
> background process to always be running, but again, is that really a
> > problem? The overhead is not exactly large.
>
> You just made three assertions about "this isn't a problem" without
> providing any evidence in support of any of them.  Maybe with some
> measurements we could have a real discussion.
>

Well, not entirely. The first two are questions "is this really a problem".

The overhead of an extra process certainly isn't much; I'm willing to say
that as an assertion.

The other two, less so; those are more open questions.

Are you actually asking for a benchmark of if logging gets slower? If so,
could you suggest a workload to make an actual benchmark of it (where
logging would be high enough that it could be come a bottleneck -- and not
writing the log data to disk, but the actual logging). I'm not sure what a
good one would be.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] pgbench tap tests & minor fixes.

2017-08-31 Thread Fabien COELHO


Hello Nikolay,

Thanks for the review!

As for function names, committers can have their say. I'm somehow not 
dissatisfied with the current version, but I also agree with you that they 
are imperfect.


As for included bug fixes, I can do separate patches, but I think that it 
is enough to first commit the pgbench files and then the tap-test files, 
in that order. I'll see what committers say.


When/if this patch is committed, it should make it easy to add more tests 
for the numerous pgbench patches that have been in the pipe for quite some 
time... In particular, the patch adding new functions and operators to pgbench 
expressions is waiting for this one to be marked ready for committer.


A further caveat to committers: I'm unsure about what happens on Windows. 
I've done my best so that it should work.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] log_destination=file

2017-08-31 Thread Tom Lane
Magnus Hagander  writes:
> My understanding is that the main reason for this is that we cannot change
> logging_collector without restarting postmaster, whereas we can change
> log_destination.

Right, because the decision whether to redirect stdout/stderr can't
be changed on the fly.

> My suggestion is we work around this by just always starting the logging
> collector, even if we're not planning to use it.

Umm

> Do people see an actual problem with that? I agree it's an extra round of
> indirection there, but is that a problem? It would also cause one more
> background process to always be running, but again, is that really a
> problem? The overhead is not exactly large.

You just made three assertions about "this isn't a problem" without
providing any evidence in support of any of them.  Maybe with some
measurements we could have a real discussion.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes.

2017-08-31 Thread Nikolay Shaplov
Hi All!

I am about to set "Ready for committer" status on this patch. So here is my 
summary for the committer, so one does not need to carefully read the whole thread.

This patch consists of three parts. Maybe they should be committed 
separately; if so, Fabien will split them, I think.

1. The tests.

Tests are quite good. Maybe I myself would write them in another style, but 
"there is more than one way to do it" in Perl, you know. 
These tests cover more than 90% of the C code of pgbench, which is good. (I did 
not check this myself, but see no reason not to trust Fabien.)

The most doubtful part of the patch is the following: the patch introduces a 
command_checks_all function in TestLib.pm that works like the command_like 
function but also allows checking STDERR output and the exit status.

First: I have some problem with the name; I would call it command_like_more or 
something similar, but I decided to leave it to the committer to make the final choice.

Second: I think that command_checks and command_like do very similar things. I 
think that later all tests should be rewritten to use command_checks and get 
rid of command_like, and I do not know how to fit this better into the 
development workflow. Maybe it should be another patch after this one is 
committed.

2. Patch for -t/-R/-L case.

Current pgbench does not process the -t/-R/-L case correctly. This is also 
fixed in this patch.

Now if you set the number of transactions using -t/--transactions, combined 
with -R/--rate or -L/--latency-limit, you can be sure there will be no more 
transactions than were specified with -t, and that they are properly counted.

This part is quite clear
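
For example, a run like the following (database name made up) now stops
after exactly 100 transactions per client, even while throttled:

    pgbench -t 100 -R 50 -L 10 testdb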

3. \n process in process_backslash_command error output

process_backslash_command raises an error if there are problems with a 
backslash command, and prints the command that has the error.

But it assumes that there is always a \n symbol at the end of the command and 
just chops it off. When the backslash command is at the end of the file, 
there is no \n at the end of the line.

So this patch introduces an expr_scanner_chomp_substring function that works 
just like expr_scanner_get_substring but skips the \n or \r\n symbols at the 
end of the line.
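
For illustration, the new helper's contract is roughly this (a standalone
sketch, not the actual exprscan.l code; pnstrdup is the backend's
length-limited strdup):

    /* Copy [start_offset, end_offset) of str, dropping a trailing \n or \r\n. */
    static char *
    chomp_substring(const char *str, int start_offset, int end_offset)
    {
        int end = end_offset;

        if (end > start_offset && str[end - 1] == '\n')
        {
            end--;
            if (end > start_offset && str[end - 1] == '\r')
                end--;
        }
        return pnstrdup(str + start_offset, end - start_offset);
    }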

I still have some problems with the function name here, but have no idea how 
to name it better.



So that's it. I hope my involvement in the review process was useful. Will be 
happy to see this patch commited into master :-)

-- 
Do code for fun. Can do it for money (Perl & C/C++ ~10h/week)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-31 Thread Magnus Hagander
On Thu, Aug 17, 2017 at 2:19 AM, Craig Ringer  wrote:

> On 17 August 2017 at 07:30, Michael Paquier 
> wrote:
>
>>
>> Definitely agreed on that. Any move function would need to check if
>> the WAL position given by caller is already newer than what's
>> available in the local pg_wal (minimum of all other slots), with a
>> shared lock that would need to be taken by xlog.c when recycling past
>> segments. A forward function works on a single entry, which should be
>> disabled at the moment of the update. It looks dangerous to me to do
>> such an operation if there is a consumer of a slot currently on it.
>>
>>
> pg_advance_replication_slot(...)
>
> ERROR's on logical slot, for now. Physical slots only.
>
> Forward-only.
>
> Future work to allow it to use the logical decoding infrastructure to
> fast-forward a slot by reading only catalog change information and
> invalidations, either via a dummy output plugin that filters out all xacts,
> or by lower level use of the decoding code.
>
> Reasonable?
>
>
PFA an updated and rebased patch.

Rebased. Now named pg_advance_replication_slot. ERROR on logical slots.
Forward only.

I think that, in the end, this covers all the comments?
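
As a usage sketch (slot name made up; pg_current_wal_lsn() is the v10
spelling of the current write position):

    SELECT pg_create_physical_replication_slot('standby1_slot');
    -- let the server recycle WAL the slot no longer needs
    SELECT pg_advance_replication_slot('standby1_slot', pg_current_wal_lsn());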

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 641b3b8f4e..452559f260 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -19030,6 +19030,24 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 be called when connected to the same database the slot was created on.

   
+  
+   
+
+ pg_advance_replication_slot
+
+pg_advance_replication_slot(slot_name name, position pg_lsn)
+   
+   
+bool
+   
+   
+Advances the current restart position of a physical
+replication slot named slot_name.
+The slot will not be moved backwards, and it will not be
+moved beyond the current insert location. Logical replication
+slots cannot be moved. Returns true if the position was changed.
+   
+  
 
   

diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index d4cbd83bde..0e27e739bf 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -177,6 +177,73 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
 }
 
 /*
+ * SQL function for moving the position in a replication slot.
+ */
+Datum
+pg_advance_replication_slot(PG_FUNCTION_ARGS)
+{
+	Name		slotname = PG_GETARG_NAME(0);
+	XLogRecPtr	moveto = PG_GETARG_LSN(1);
+	bool		changed = false;
+	bool 		backwards = false;
+
+	Assert(!MyReplicationSlot);
+
+	check_permissions();
+
+	if (XLogRecPtrIsInvalid(moveto))
+		ereport(ERROR,
+(errmsg("Invalid target xlog position ")));
+
+	/* Temporarily acquire the slot so we "own" it */
+	ReplicationSlotAcquire(NameStr(*slotname), true);
+
+	/* Only physical slots can be moved */
+	if (MyReplicationSlot->data.database != InvalidOid)
+	{
+		ReplicationSlotRelease();
+		ereport(ERROR,
+(errmsg("Only physical replication slots can be advanced.")));
+	}
+
+	if (moveto > GetXLogWriteRecPtr())
+		/* Can't move past current position, so truncate there */
+		moveto = GetXLogWriteRecPtr();
+
+	/* Now adjust it */
+	SpinLockAcquire(&MyReplicationSlot->mutex);
+	if (MyReplicationSlot->data.restart_lsn != moveto)
+	{
+		/* Never move backwards, because bad things can happen */
+		if (MyReplicationSlot->data.restart_lsn > moveto)
+			backwards = true;
+		else
+		{
+			MyReplicationSlot->data.restart_lsn = moveto;
+			changed = true;
+		}
+	}
+	SpinLockRelease(&MyReplicationSlot->mutex);
+
+	if (backwards)
+		ereport(WARNING,
+(errmsg("Not moving replication slot backwards!")));
+
+
+	if (changed)
+	{
+		ReplicationSlotMarkDirty();
+		ReplicationSlotsComputeRequiredLSN();
+		ReplicationSlotSave();
+	}
+
+	ReplicationSlotRelease();
+
+	PG_RETURN_BOOL(changed);
+}
+
+
+/*
  * pg_get_replication_slots - SQL SRF showing active replication slots.
  */
 Datum
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 8b33b4e0ea..3495447cf9 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -5293,6 +5293,8 @@ DATA(insert OID = 3779 (  pg_create_physical_replication_slot PGNSP PGUID 12 1 0
 DESCR("create a physical replication slot");
 DATA(insert OID = 3780 (  pg_drop_replication_slot PGNSP PGUID 12 1 0 0 0 f f f f t f v u 1 0 2278 "19" _null_ _null_ _null_ _null_ _null_ pg_drop_replication_slot _null_ _null_ _null_ ));
 DESCR("drop a replication slot");
+DATA(insert OID = 3998 (  pg_advance_replication_slot PGNSP PGUID 12 1 0 0 0 f f f f t f v u 2 0 16 "19 3220" _null_ _null_ _null_ _null_ _null_ pg_advance_replication_slot _null_ _null_ _null_ ));
+DESCR("move a replication slot position");
 DATA(insert OID = 

Re: [HACKERS] More replication race conditions

2017-08-31 Thread Simon Riggs
On 27 August 2017 at 03:32, Noah Misch  wrote:
> On Fri, Aug 25, 2017 at 12:09:00PM +0200, Petr Jelinek wrote:
>> On 24/08/17 19:54, Tom Lane wrote:
>> > sungazer just failed with
>> >
>> > pg_recvlogical exited with code '256', stdout '' and stderr 
>> > 'pg_recvlogical: could not send replication command "START_REPLICATION 
>> > SLOT "test_slot" LOGICAL 0/0 ("include-xids" '0', "skip-empty-xacts" 
>> > '1')": ERROR:  replication slot "test_slot" is active for PID 8913148
>> > pg_recvlogical: disconnected
>> > ' at 
>> > /home/nm/farm/gcc64/HEAD/pgsql.build/src/test/recovery/../../../src/test/perl/PostgresNode.pm
>> >  line 1657.
>> >
>> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer=2017-08-24%2015%3A16%3A10
>> >
>> > Looks like we're still not there on preventing replication startup
>> > race conditions.
>>
>> Hmm, that looks like "by design" behavior. Slot acquiring will throw
>> error if the slot is already used by somebody else (slots use their own
>> locking mechanism that does not wait). In this case it seems the
>> walsender which was using slot for previous previous step didn't finish
>> releasing the slot by the time we start new command. We can work around
>> this by changing the test to wait perhaps.
>
> [Action required within three days.  This is a generic notification.]
>
> The above-described topic is currently a PostgreSQL 10 open item.  Simon,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.
>
> [1] 
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

I acknowledge receipt of the reminder and will fix by end of day tomorrow.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] log_destination=file

2017-08-31 Thread Magnus Hagander
Attached is a very much WIP (AKA rough draft) patch for $subject.

Right now we have two parameters controlling logging destination, and they
work in interesting combinations:

log_destination=stderr, logging_collector=off -> log to stderr (makes sense)
log_destination=stderr, logging_collector=on  -> log to file (*highly*
illogical for most users, to set stderr when they want file)
log_destination=csvlog, logging_collector=on -> log to csvfile (makes sense)
log_destination=csvlog, logging_collector=off -> fail (ugh)

(beyond that we also have syslog and eventlog, but those are different and
not touched by this patch)

My understanding is that the main reason for this is that we cannot change
logging_collector without restarting postmaster, whereas we can change
log_destination.

My suggestion is we work around this by just always starting the logging
collector, even if we're not planning to use it. Then we'd just have:

log_destination = stderr -> log to stderr
log_destination = file -> log to file
log_destination = csvlog -> log to csv file
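
Switching between these would then be a plain reload, e.g.:

    # postgresql.conf sketch under the proposed scheme
    log_destination = 'file'   # or 'stderr' / 'csvlog'; SIGHUP is enough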

The main difference here is that log_destination=stderr would also go
through the logging collector which would then assemble it and write it out
to it's own stderr.

Do people see an actual problem with that? I agree it's an extra round of
indirection there, but is that a problem? It would also cause one more
background process to always be running, but again, is that really a
problem? The overhead is not exactly large.

It would make configuration a lot more logical for logging. It would also
make it possible to switch between all logging configurations without
restarting.

The attached WIP is mostly working for the simple cases. It fails
completely if the syslogger is restarted at this point, simply because I
haven't figured out how to pass the original stderr down to the syslogger.
I'm sure there are also many other smaller issues around it, but I wanted
to get the discussion done before I spend the time to go through those.

Thoughts?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 95180b2ef5..7c4eb5743f 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1764,7 +1764,7 @@ ServerLoop(void)
 		}
 
 		/* If we have lost the log collector, try to start a new one */
-		if (SysLoggerPID == 0 && Logging_collector)
+		if (SysLoggerPID == 0)
 			SysLoggerPID = SysLogger_Start();
 
 		/*
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index 3255b42c7d..15fc606d24 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -57,10 +57,8 @@
 
 
 /*
- * GUC parameters.  Logging_collector cannot be changed after postmaster
- * start, but the rest can change at SIGHUP.
+ * GUC parameters. Can change at SIGHUP.
  */
-bool		Logging_collector = false;
 int			Log_RotationAge = HOURS_PER_DAY * MINS_PER_HOUR;
 int			Log_RotationSize = 10 * 1024;
 char	   *Log_directory = NULL;
@@ -182,6 +180,10 @@ SysLoggerMain(int argc, char *argv[])
 	 * assumes that all interesting messages generated in the syslogger will
 	 * come through elog.c and will be sent to write_syslogger_file.
 	 */
+	/*
+	 * XXX: this does not work when we want to use stderr in the syslogger.
+	 * needs fixing!
+	 */
 	if (redirection_done)
 	{
 		int			fd = open(DEVNULL, O_WRONLY, 0);
@@ -370,10 +372,11 @@ SysLoggerMain(int argc, char *argv[])
 		if (!rotation_requested && Log_RotationSize > 0 && !rotation_disabled)
 		{
 			/* Do a rotation if file is too big */
-			if (ftell(syslogFile) >= Log_RotationSize * 1024L)
+			if (syslogFile != NULL &&
+ftell(syslogFile) >= Log_RotationSize * 1024L)
 			{
 rotation_requested = true;
-size_rotation_for |= LOG_DESTINATION_STDERR;
+size_rotation_for |= LOG_DESTINATION_FILE;
 			}
 			if (csvlogFile != NULL &&
 ftell(csvlogFile) >= Log_RotationSize * 1024L)
@@ -390,7 +393,7 @@ SysLoggerMain(int argc, char *argv[])
 			 * was sent by pg_rotate_logfile.
 			 */
 			if (!time_based_rotation && size_rotation_for == 0)
-size_rotation_for = LOG_DESTINATION_STDERR | LOG_DESTINATION_CSVLOG;
+size_rotation_for = LOG_DESTINATION_FILE | LOG_DESTINATION_CSVLOG;
 			logfile_rotate(time_based_rotation, size_rotation_for);
 		}
 
@@ -522,9 +525,6 @@ SysLogger_Start(void)
 	pid_t		sysloggerPid;
 	char	   *filename;
 
-	if (!Logging_collector)
-		return 0;
-
 	/*
 	 * If first time through, create the pipe which will receive stderr
 	 * output.
@@ -581,7 +581,10 @@ SysLogger_Start(void)
 	first_syslogger_file_time = time(NULL);
 	filename = logfile_getname(first_syslogger_file_time, NULL);
 
-	syslogFile = logfile_open(filename, "a", false);
+	if (Log_destination & LOG_DESTINATION_FILE)
+		syslogFile = logfile_open(filename, "a", false);
+	else
+		syslogFile = NULL;
 

[HACKERS] CurrentUserId may be invalid during the rest of a session

2017-08-31 Thread Richard Guo
Hi,

During the start of the first transaction in a backend, if an 'ereport'
occurs after the transaction state has been set but before CurrentUserId
has been saved into 'prevUser' in 'TransactionStateData', CurrentUserId
will be invalid for the rest of the session.

Take branch 'REL9_6_STABLE' for example:

1797 static void
1798 StartTransaction(void)
1799 {
1800     TransactionState s;

     ...

1822     s->state = TRANS_START;

              <=== 'ereport' in this window

     ...

1909     GetUserIdAndSecContext(&s->prevUser, &s->prevSecContext);

     ...

1927 }

If 'ereport' occurs in the described window, CurrentUserId will have no
chance to be saved into 'prevUser', and 'prevUser' will remain InvalidOid
since this is the first transaction of the session.

As the transaction state has been set to TRANS_START, 'AbortTransaction'
will then be called and CurrentUserId will be restored from 'prevUser',
which is InvalidOid. So for the rest of the session, CurrentUserId will be
invalid.
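
One conceivable mitigation -- only a sketch, and it may well have issues
of its own -- would be to capture a valid user ID before the state is
switched to TRANS_START, so that 'AbortTransaction' never restores
InvalidOid:

    /* hypothetical reordering inside StartTransaction(); untested */
    GetUserIdAndSecContext(&s->prevUser, &s->prevSecContext);
    ...
    s->state = TRANS_START;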

The invalid CurrentUserId may cause assertion failure or other issues, for
example:

(gdb) bt
#0  0x7f3d8ced9495 in raise () from /lib64/libc.so.6
#1  0x7f3d8cedac75 in abort () from /lib64/libc.so.6
#2  0x0095fdbd in ExceptionalCondition (conditionName=0xb72838
"!(((bool) ((CurrentUserId) != ((Oid) 0", errorType=0xb726ff "BadState",
fileName=0xb726c0 "miscinit.c", lineNumber=284) at assert.c:54
#3  0x00971b88 in GetUserId () at miscinit.c:284
#4  0x005559c4 in recomputeNamespacePath () at namespace.c:3496
#5  0x00551d53 in RelnameGetRelid (relname=0x1d3f288 "t1") at
namespace.c:673
#6  0x005514a7 in RangeVarGetRelidExtended (relation=0x1d3f2a8,
lockmode=1, missing_ok=1 '\001', nowait=0 '\000', callback=0x0,
callback_arg=0x0)
at namespace.c:326

Is this expected behavior?

Thanks
-Richard


Re: [HACKERS] Range Merge Join v1

2017-08-31 Thread Jeff Davis
On Fri, Aug 25, 2017 at 10:19 AM, Alexander Kuzmenkov
 wrote:
> Hi Jeff,

Hi,

Thank you for the review and suggestions!

> * At the moment, "mergejoinable clause" and "equality clause" mean the same
> thing to the planner, and those clauses are used both to create equivalence
> classes and to perform merge joins. If we start performing merge joins for
> different kinds of clauses, such as comparison or range intersection, it
> makes sense to separate these two meanings. I tried this in my patch and it
> didn't require many changes. I use RestrictInfo.equivopfamilies to build
> equivalence classes, and RestrictInfo.mergeopfamilies for mergejoinable
> clauses.

I like the idea. I will look at it in more detail, and I can either change
my patch or piggyback on yours.

> * Semantically, MJCompare() is a function that determines whether you should
> emit a join result or advance one of the pointers. This meaning is not
> explicit in the code and is not well encapsulated: the function returns an
> int and a 'compareNoMatch' flag, and the calling code combines them in various
> ways to derive the final result. This meaning can be made explicit by making
> MJCompare return enum values {Join, NextInner, NextOuter}, and putting
> inside it all the logic that decides whether we join or advance.
> ExecMergeJoin looks intimidating already, and I think this change would help
> readability. Again, you can find an illustration in my patch.

I actually tried doing something similar in my patch, but there are
four cases to consider in EXEC_MJ_SKIP_TEST:

1. Overlaps: mark and then join the tuples
2. left-of: SKIPOUTER_ADVANCE
3. right-of: SKIPINNER_ADVANCE
4. None of the above: mark and NEXTINNER

However, you are right that the current code is ugly, and I should
refactor into 4 explicit cases. I don't think I will call them by the
same names as the join states, though, because there's not a direct
mapping.
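
Concretely, the four-valued result could be sketched like this (names
illustrative only -- as noted, I won't reuse the join-state names):

    /* illustrative sketch of a four-valued MJCompare() result */
    typedef enum MJCompareResult
    {
        MJCR_MARK_AND_JOIN,      /* 1. overlaps: mark, then join the tuples */
        MJCR_ADVANCE_OUTER,      /* 2. left-of: advance the outer side */
        MJCR_ADVANCE_INNER,      /* 3. right-of: advance the inner side */
        MJCR_MARK_AND_NEXTINNER  /* 4. none of the above: mark, next inner */
    } MJCompareResult;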

Updated patch attached. Changelog:

* Rebased
* Changed MJCompare to return an enum as suggested, but it has 4
possible values rather than 3.
* Added support for joining on contains or contained by (@> or <@) and
updated tests.
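
For instance, with the last item a containment join such as the following
should be able to use the same machinery (illustrative query, not one of
the new regression tests):

    -- find sessions whose activity range lies entirely within another's
    SELECT a.app_session_id, b.app_session_id
    FROM app_session a, app_session b
    WHERE a.during <@ b.during
      AND a.app_session_id <> b.app_session_id;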

Regards,
Jeff Davis
diff --git a/doc/src/sgml/rangetypes.sgml b/doc/src/sgml/rangetypes.sgml
index 9557c16..84578a7 100644
*** a/doc/src/sgml/rangetypes.sgml
--- b/doc/src/sgml/rangetypes.sgml
***
*** 522,525  INSERT 0 1
--- 522,595 
  

   
+  
+   Range Join
+ 
+   
+ range type
+ join
+   
+ 
+   
+ A Range Join is a join where one of the join conditions is the range
+ overlaps operator (see ). In other
+ words, rather than returning rows where corresponding fields are equal, it
+ returns rows where the corresponding fields overlap.
+   
+   
+ For instance, to find users who were active on a website at the same time
+ as they were using a mobile app, you might issue a query like:
+ 
+ CREATE TABLE web_session(
+ web_session_id text primary key,
+ username text,
+ during tstzrange);
+ CREATE TABLE app_session(
+ app_session_id text primary key,
+ username text,
+ during tstzrange);
+ INSERT INTO web_session VALUES
+ ('0cc175b9c0f1b6a8', 'user1', '[2017-02-01 09:30, 2017-02-01 10:45)'),
+ ('92eb5ffee6ae2fec', 'user2', '[2017-02-01 09:30, 2017-02-01 10:45)'),
+ ('4a8a08f09d37b737', 'user3', '[2017-02-01 09:30, 2017-02-01 10:45)');
+ INSERT INTO app_session VALUES
+ ('8277e0910d750195', 'user1', '[2017-02-01 10:30, 2017-02-01 11:45)'),
+ ('b448797616e091ad', 'user3', '[2017-02-01 09:00, 2017-02-01 11:00)'),
+ ('95649038408b5f33', 'user4', '[2017-02-01 09:30, 2017-02-01 10:45)');
+ 
+ SELECT w.username,
+w.during * a.during AS both_during
+ FROM  web_session w, app_session a
+ WHERE w.username = a.username
+ AND w.during && a.during;
+  username | both_during 
+ --+-
+  user1| ["2017-02-01 10:30:00-08","2017-02-01 10:45:00-08")
+  user3| ["2017-02-01 09:30:00-08","2017-02-01 10:45:00-08")
+ (2 rows)
+ 
+   
+   
+ In addition to the general execution strategies available, Postgres also
+ has special support for a variant of Merge Join called Range Merge Join:
+ 
+  EXPLAIN (costs off) SELECT w.username,
+w.during * a.during AS both_during
+ FROM  web_session w, app_session a
+ WHERE w.username = a.username
+ AND w.during && a.during;
+   QUERY PLAN  
+ --
+  Range Merge Join
+Merge Cond: ((w.during && a.during) AND (w.username = a.username))
+->  Sort
+  Sort Key: w.during, w.username
+  ->  Seq Scan on web_session w
+->  Sort
+  Sort Key: a.during, a.username
+  ->  Seq Scan on app_session a
+ (8 rows)
+ 
+   
+  
  
diff --git 

Re: [HACKERS] UPDATE of partition key

2017-08-31 Thread Amit Khandekar
Thanks Dilip. I am working on rebasing the patch. In particular, the
partition walker in my patch depended on the fact that all the tables
get opened (and then closed) while creating the tuple routing info.
But in HEAD, only the partitioned tables now get opened, so my patch
needs some changes.

The partition-walker-related changes will become inapplicable once the
changes for expansion of inheritance in bound order from the other
thread [1] are committed, but until then I will have to rebase the
partition walker changes over HEAD.

[1] 
https://www.postgresql.org/message-id/0118a1f2-84bb-19a7-b906-dec040a206f2%40lab.ntt.co.jp


On 31 August 2017 at 12:09, Dilip Kumar  wrote:
> On Fri, Aug 11, 2017 at 10:44 AM, Amit Khandekar  
> wrote:
>> On 4 August 2017 at 22:28, Amit Khandekar  wrote:

>
> I am planning to review and test this patch. It seems like it needs
> to be rebased.
>
> [dilip@localhost postgresql]$ patch -p1 <
> ../patches/update-partition-key_v15.patch
> patching file doc/src/sgml/ddl.sgml
> patching file doc/src/sgml/ref/update.sgml
> patching file doc/src/sgml/trigger.sgml
> patching file src/backend/catalog/partition.c
> Hunk #3 succeeded at 910 (offset -1 lines).
> Hunk #4 succeeded at 924 (offset -1 lines).
> Hunk #5 succeeded at 934 (offset -1 lines).
> Hunk #6 succeeded at 994 (offset -1 lines).
> Hunk #7 succeeded at 1009 with fuzz 1 (offset 3 lines).
> Hunk #8 FAILED at 1023.
> Hunk #9 succeeded at 1059 with fuzz 2 (offset 10 lines).
> Hunk #10 succeeded at 2069 (offset 2 lines).
> Hunk #11 succeeded at 2406 (offset 2 lines).
> 1 out of 11 hunks FAILED -- saving rejects to file
> src/backend/catalog/partition.c.rej
> patching file src/backend/commands/copy.c
> Hunk #2 FAILED at 1426.
> Hunk #3 FAILED at 1462.
> Hunk #4 succeeded at 2616 (offset 7 lines).
> Hunk #5 succeeded at 2726 (offset 8 lines).
> Hunk #6 succeeded at 2846 (offset 8 lines).
> 2 out of 6 hunks FAILED -- saving rejects to file
> src/backend/commands/copy.c.rej
> patching file src/backend/commands/trigger.c
> Hunk #4 succeeded at 5261 with fuzz 2.
> patching file src/backend/executor/execMain.c
> Hunk #1 succeeded at 65 (offset 1 line).
> Hunk #2 succeeded at 103 (offset 1 line).
> Hunk #3 succeeded at 1829 (offset 20 lines).
> Hunk #4 succeeded at 1860 (offset 20 lines).
> Hunk #5 succeeded at 1927 (offset 20 lines).
> Hunk #6 succeeded at 2044 (offset 21 lines).
> Hunk #7 FAILED at 3210.
> Hunk #8 FAILED at 3244.
> Hunk #9 succeeded at 3289 (offset 26 lines).
> Hunk #10 FAILED at 3340.
> Hunk #11 succeeded at 3387 (offset 29 lines).
> Hunk #12 succeeded at 3424 (offset 29 lines).
> 3 out of 12 hunks FAILED -- saving rejects to file
> src/backend/executor/execMain.c.rej
> patching file src/backend/executor/execReplication.c
> patching file src/backend/executor/nodeModifyTable.c
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com



-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] expanding inheritance in partition bound order

2017-08-31 Thread Amit Langote
On 2017/08/31 4:45, Robert Haas wrote:
> On Wed, Aug 30, 2017 at 12:47 PM, Ashutosh Bapat
>  wrote:
>> +1. I think we should just pull out the OIDs from partition descriptor.
> 
> Like this?  The first patch refactors the expansion of a single child
> out into a separate function, and the second patch implements EIBO on
> top of it.
> 
> I realized while doing this that we really want to expand the
> partitioning hierarchy depth-first, not breadth-first.  For some
> things, like partition-wise join in the case where all bounds match
> exactly, we really only need a *predictable* ordering that will be the
> same for two equi-partitioned tables.  A breadth-first expansion will
> give us that.  But it's not actually in bound order.  For example:
> 
> create table foo (a int, b text) partition by list (a);
> create table foo1 partition of foo for values in (2);
> create table foo2 partition of foo for values in (1) partition by range (b);
> create table foo2a partition of foo2 for values from ('b') to ('c');
> create table foo2b partition of foo2 for values from ('a') to ('b');
> create table foo3 partition of foo for values in (3);
> 
> The correct bound-order expansion of this is foo2b - foo2a - foo1 -
> foo3, which is indeed what you get with the attached patch.  But if we
> did the expansion in breadth-first fashion, we'd get foo1 - foo3 -
> foo2a, foo2b, which is, well, not in bound order.  If the idea is that
> you see a > 2 and rule out all partitions that appear before the first
> one with an a-value >= 2, it's not going to work.

I think, overall, this might be a good idea.  Thanks for working on it.

The patches I posted in the "path toward faster partition pruning" thread
achieve the same end result as your patch, namely that the leaf partitions
appear in partition bound order in the Append path for a partitioned
table.  They achieve that result in a somewhat different way, but let's
set that aside for a moment.  One thing the patch on that thread didn't
achieve, though, is getting the leaf partitions in the same (partition
bound) order in the ModifyTable path for UPDATE/DELETE, because the
inheritance_planner() path is not modified in a suitable way (in fact, I'm
afraid there might be a deadlock bug lurking there, which I must address).

Your patch, OTOH, achieves the same order in both cases, which seems
desirable.

> Mind you, that idea has some problems anyway in the face of default
> partitions, null partitions, and list partitions which accept
> non-contiguous values (e.g. one partition for 1, 3, 5; another for 2,
> 4, 6).  We might need to mark the PartitionDesc to indicate whether
> PartitionDesc-order is in fact bound-order in a particular instance,
> or something like that.

ISTM the primary motivation for the EIBO patch at this point is to get
the partitions ordered in a predictable manner so that the partition-wise
join patch and the update-partition-key patch can implement certain logic
using an O(n) algorithm rather than an O(n^2) one.  Neither of them
depends on the actual order in the sense of, say, sticking a PathKey to
the resulting Append.  Perhaps the patch to "Make the optimiser aware of
partitions ordering" [1] will have to consider this somehow; maybe by
limiting its scope to only the cases where the root partitioned table is
range partitioned.

Thanks,
Amit

[1] https://commitfest.postgresql.org/14/1093/



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-08-31 Thread Fabien COELHO


Hello Masahiko-san,

> [...] Personally I prefer "t" for table creation because "c" for create
> is a generic word. We might want to have another initialization command
> that creates something.


Ok, good point.


About the patch: applies, compiles, works for me. A few minor comments.

While re-reading the documentation, I think that it should be "Set custom 
initialization steps". It could be "Require ..." when -I implied -i, but 
since -i is still required, the sentence does not seem to apply as such.


"Destroying any existing tables: ..." -> "Destroy existing pgbench tables: 
...".


I would suggest adding short expanded explanations in the term definitions, 
next to the triggering letter, to underline the mnemonic. Something like:


   c (cleanup)
   t (table creation)
   g (generate data)
   v (vacuum)
   p (primary key)
   f (foreign key)

Also update the error message in checkCustomCommands to "ctgvpf".
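
An invocation could then look like this (assuming the final option
spelling stays as discussed above, with -i still required):

   # cleanup, create tables, load data, build primary keys; skip the rest
   pgbench -i -I ctgp bench_db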

Cleanup should have a message when it is executed. I suggest "cleaning 
up...".


Maybe add a comment in front of the tables array to say that the order is 
important, something like "tables in reverse foreign-key dependency 
order"?


case 'I': ISTM that initialize_cmds is necessarily already allocated, thus 
I would not bother to test before pg_free.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-08-31 Thread Masahiko Sawada
On Thu, Aug 31, 2017 at 10:52 AM, Bossart, Nathan  wrote:
> On 8/30/17, 5:37 PM, "Michael Paquier"  wrote:
>> +VacuumRelation *
>> +makeVacuumRelation(RangeVar *relation, List *va_cols, Oid oid)
>> +{
>> +   VacuumRelation *vacrel = makeNode(VacuumRelation);
>> +   vacrel->relation = relation;
>> +   vacrel->va_cols = va_cols;
>> +   vacrel->oid = oid;
>> +   return vacrel;
>> +}
>> Perhaps in makefuncs.c instead of vacuum.c? That's usually the place
>> used for node constructions like that.
>
> This function is moved in v11.
>
> On 8/30/17, 6:52 PM, "Michael Paquier"  wrote:
>> On Thu, Aug 31, 2017 at 8:35 AM, David G. Johnston
>>  wrote:
>>> Inspired by the syntax documentation for EXPLAIN:
>>>
>>> VACUUM [ ( option [, ...] ) ] [ table_def [, ...] ]
>>>
>>> where option can be one of:
>>> FULL
>>> FREEZE
>>> VERBOSE
>>> DISABLE_PAGE_SKIPPING
>>>
>>> and where table_def is:
>>> table_name [ ( column_name [, ... ] ) ]
>>
>> Yes, splitting things would be nice with the column list. I need more coffee.
>
> I've made this change in v11 as well.
>
> v2 of the de-duplication patch seems to still apply cleanly, so I haven't
> made any further changes to it.
>

I reviewed these patches and found an issue.

The autovacuum worker does not seem to work correctly. I got this error
message:

ERROR:  unrecognized node type: 0
CONTEXT:  automatic analyze of table "postgres.public.hoge"

I think we should set rangevar.type to T_RangeVar in the
autovacuum_do_vac_analyze function.
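
That is, something like this in the stack-allocated setup (sketch):

    /* tag the node explicitly so node-handling code recognizes it */
    rangevar.type = T_RangeVar;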

Also, there is a small typo in dedupe_vacuum_relations_v2.patch.

+   /* if already procesed or not equal, skip */
+   if (list_member_int(duplicates, i) ||
relation->oid != nth_rel->oid)
+   continue;

s/procesed/processed/g

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Append implementation

2017-08-31 Thread Amit Khandekar
The last updated patch needs a rebase. Attached is the rebased version.

Thanks
-Amit Khandekar


ParallelAppend_v13_rebased_3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] expanding inheritance in partition bound order

2017-08-31 Thread Ashutosh Bapat
On Thu, Aug 31, 2017 at 1:15 AM, Robert Haas  wrote:
> On Wed, Aug 30, 2017 at 12:47 PM, Ashutosh Bapat
>  wrote:
>> +1. I think we should just pull out the OIDs from partition descriptor.
>
> Like this?  The first patch refactors the expansion of a single child
> out into a separate function, and the second patch implements EIBO on
> top of it.
>
> I realized while doing this that we really want to expand the
> partitioning hierarchy depth-first, not breadth-first.  For some
> things, like partition-wise join in the case where all bounds match
> exactly, we really only need a *predictable* ordering that will be the
> same for two equi-partitioned tables.

+1. Spotted right!

> A breadth-first expansion will
> give us that.  But it's not actually in bound order.  For example:
>
> create table foo (a int, b text) partition by list (a);
> create table foo1 partition of foo for values in (2);
> create table foo2 partition of foo for values in (1) partition by range (b);
> create table foo2a partition of foo2 for values from ('b') to ('c');
> create table foo2b partition of foo2 for values from ('a') to ('b');
> create table foo3 partition of foo for values in (3);
>
> The correct bound-order expansion of this is foo2b - foo2a - foo1 -
> foo3, which is indeed what you get with the attached patch.  But if we
> did the expansion in breadth-first fashion, we'd get foo1 - foo3 -
> foo2a, foo2b, which is, well, not in bound order.  If the idea is that
> you see a > 2 and rule out all partitions that appear before the first
> one with an a-value >= 2, it's not going to work.

Here are the patches, revised a bit. I have especially changed the
variable names and arguments to reflect their true roles in the
functions, and updated the prologue of expand_single_inheritance_child()
to mention "has_child". Let me know if those changes look good.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
From ed494bff369e43ff92128b9bd9c553eb19dffdc6 Mon Sep 17 00:00:00 2001
From: Robert Haas 
Date: Wed, 30 Aug 2017 12:53:43 -0400
Subject: [PATCH 1/2] expand_single_inheritance_child by Robert

with

My changes to Rename arguments to and variables in
expand_single_inheritance_child() in accordance to their usage in the
function.
---
 src/backend/optimizer/prep/prepunion.c |  225 ++--
 1 file changed, 127 insertions(+), 98 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index e73c819..bb8f1ce 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -100,6 +100,12 @@ static List *generate_append_tlist(List *colTypes, List *colCollations,
 static List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist);
 static void expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte,
 		 Index rti);
+static void expand_single_inheritance_child(PlannerInfo *root,
+RangeTblEntry *parentrte,
+Index parentRTindex, Relation parentrel,
+PlanRowMark *parentrc, Relation childrel,
+bool *has_child, List **appinfos,
+List **partitioned_child_rels);
 static void make_inh_translation_list(Relation oldrelation,
 		  Relation newrelation,
 		  Index newvarno,
@@ -1459,9 +1465,6 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
 	{
 		Oid			childOID = lfirst_oid(l);
 		Relation	newrelation;
-		RangeTblEntry *childrte;
-		Index		childRTindex;
-		AppendRelInfo *appinfo;
 
 		/* Open rel if needed; we already have required locks */
 		if (childOID != parentOID)
@@ -1481,101 +1484,10 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
 			continue;
 		}
 
-		/*
-		 * Build an RTE for the child, and attach to query's rangetable list.
-		 * We copy most fields of the parent's RTE, but replace relation OID
-		 * and relkind, and set inh = false.  Also, set requiredPerms to zero
-		 * since all required permissions checks are done on the original RTE.
-		 * Likewise, set the child's securityQuals to empty, because we only
-		 * want to apply the parent's RLS conditions regardless of what RLS
-		 * properties individual children may have.  (This is an intentional
-		 * choice to make inherited RLS work like regular permissions checks.)
-		 * The parent securityQuals will be propagated to children along with
-		 * other base restriction clauses, so we don't need to do it here.
-		 */
-		childrte = copyObject(rte);
-		childrte->relid = childOID;
-		childrte->relkind = newrelation->rd_rel->relkind;
-		childrte->inh = false;
-		childrte->requiredPerms = 0;
-		childrte->securityQuals = NIL;
-		parse->rtable = lappend(parse->rtable, childrte);
-		childRTindex = list_length(parse->rtable);
-
-		/*
-		 * Build an AppendRelInfo for this parent and child, unless the child
-		 * is a partitioned table.
-		 */
-		if 

Re: [HACKERS] UPDATE of partition key

2017-08-31 Thread Dilip Kumar
On Fri, Aug 11, 2017 at 10:44 AM, Amit Khandekar  wrote:
> On 4 August 2017 at 22:28, Amit Khandekar  wrote:
>>>

I am planning to review and test this patch. It seems like it needs to
be rebased.

[dilip@localhost postgresql]$ patch -p1 <
../patches/update-partition-key_v15.patch
patching file doc/src/sgml/ddl.sgml
patching file doc/src/sgml/ref/update.sgml
patching file doc/src/sgml/trigger.sgml
patching file src/backend/catalog/partition.c
Hunk #3 succeeded at 910 (offset -1 lines).
Hunk #4 succeeded at 924 (offset -1 lines).
Hunk #5 succeeded at 934 (offset -1 lines).
Hunk #6 succeeded at 994 (offset -1 lines).
Hunk #7 succeeded at 1009 with fuzz 1 (offset 3 lines).
Hunk #8 FAILED at 1023.
Hunk #9 succeeded at 1059 with fuzz 2 (offset 10 lines).
Hunk #10 succeeded at 2069 (offset 2 lines).
Hunk #11 succeeded at 2406 (offset 2 lines).
1 out of 11 hunks FAILED -- saving rejects to file
src/backend/catalog/partition.c.rej
patching file src/backend/commands/copy.c
Hunk #2 FAILED at 1426.
Hunk #3 FAILED at 1462.
Hunk #4 succeeded at 2616 (offset 7 lines).
Hunk #5 succeeded at 2726 (offset 8 lines).
Hunk #6 succeeded at 2846 (offset 8 lines).
2 out of 6 hunks FAILED -- saving rejects to file
src/backend/commands/copy.c.rej
patching file src/backend/commands/trigger.c
Hunk #4 succeeded at 5261 with fuzz 2.
patching file src/backend/executor/execMain.c
Hunk #1 succeeded at 65 (offset 1 line).
Hunk #2 succeeded at 103 (offset 1 line).
Hunk #3 succeeded at 1829 (offset 20 lines).
Hunk #4 succeeded at 1860 (offset 20 lines).
Hunk #5 succeeded at 1927 (offset 20 lines).
Hunk #6 succeeded at 2044 (offset 21 lines).
Hunk #7 FAILED at 3210.
Hunk #8 FAILED at 3244.
Hunk #9 succeeded at 3289 (offset 26 lines).
Hunk #10 FAILED at 3340.
Hunk #11 succeeded at 3387 (offset 29 lines).
Hunk #12 succeeded at 3424 (offset 29 lines).
3 out of 12 hunks FAILED -- saving rejects to file
src/backend/executor/execMain.c.rej
patching file src/backend/executor/execReplication.c
patching file src/backend/executor/nodeModifyTable.c

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages

2017-08-31 Thread Dilip Kumar
On Wed, Aug 30, 2017 at 2:00 AM, Robert Haas  wrote:

>
> I suggest defining a TBM_FILLFACTOR constant instead of repeating the
> value 0.9 in a bunch of places.  I think it would also be good to try
> to find the sweet spot for that constant.  Making it bigger reduces
> the number of lossy entries  created, but making it smaller reduces
> the number of times we have to walk the bitmap.  So if, for example,
> 0.75 is sufficient to produce almost all of the gain, then I think we
> would want to prefer 0.75 to 0.9.  But if 0.9 is better, then we can
> stick with that.
>
> Note that a value higher than 0.9375 wouldn't be sane without some
> additional safety precautions because maxentries could be as low as
> 16.

Thanks for the feedback.  I will work on it.
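
Something along these lines, perhaps (a sketch only -- the final value
needs the benchmarking you describe):

    /*
     * Sketch: fraction of maxentries at which we lossify.  Anything above
     * 15/16 (0.9375) would need extra care, since maxentries can be as
     * low as 16.
     */
    #define TBM_FILLFACTOR 0.9

    /* and in the places that currently hard-code 0.9: */
    if (tbm->nentries > tbm->maxentries * TBM_FILLFACTOR)
        tbm_lossify(tbm);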

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Hooks to track changed pages for backup purposes

2017-08-31 Thread Andrey Borodin
Hi hackers!

Here is a patch with hooks that I consider sufficient for implementing 
incremental backup with page tracking as an extension.

Recently I was posting these things to the thread "Adding hook in BufferSync 
for backup purposes" [0], but I am starting a separate thread here since the 
Subject of the previous discussion is technically wrong.

Currently, incremental backup tools can use one of these methods to take a 
diff of a cluster since some LSN:
1. Check the LSN of every page
2. Scan the WAL and collect block numbers of changed pages

I propose adding hooks:
1. When a buffer is registered in a WAL insertion.
This hook is supposed to place block numbers in temporary storage, like a 
backend-local static array.
2. When a WAL record insertion is started and finished, to transfer block 
numbers to more concurrency-protected storage.
3. When the WAL segment is switched, to initiate an asynchronous transfer of 
accumulated block numbers to durable storage.
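
In code terms, the hook points above might look roughly like this (all
names and signatures are provisional, for discussion only):

    typedef void (*wal_buffer_registered_hook_type) (RelFileNode rnode,
                                                     ForkNumber forknum,
                                                     BlockNumber blkno);
    typedef void (*wal_record_boundary_hook_type) (bool start);
    typedef void (*wal_segment_switch_hook_type) (XLogSegNo segno);

    extern PGDLLIMPORT wal_buffer_registered_hook_type wal_buffer_registered_hook;
    extern PGDLLIMPORT wal_record_boundary_hook_type wal_record_boundary_hook;
    extern PGDLLIMPORT wal_segment_switch_hook_type wal_segment_switch_hook;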

Once we have accumulated changed block numbers for most segments, we can 
significantly speed up the WAL-scanning method. If we have block numbers for 
all segments, we can skip WAL scanning entirely.

I think these proposed hooks can enable more efficient backups. What do you 
think?

Any ideas would be appreciated. This patch is influenced by the code of PTRACK 
(Yury Zhuravlev and Postgres Professional).

Best regards, Andrey Borodin.
 

[0] 
https://www.postgresql.org/message-id/flat/20051502087457%40webcorp01e.yandex-team.ru#20051502087...@webcorp01e.yandex-team.ru



0001-hooks-to-watch-for-changed-pages.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers