Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-18 Thread Michael Paquier
On Thu, Jan 19, 2017 at 7:27 AM, Alvaro Herrera
 wrote:
> I thought this part was odd -- I mean, why is SysLogger_Start() being
> called if the collector is not enabled?  Turns out we do it and return
> early if not enabled.  But not in all cases -- there is one callsite in
> postmaster.c that avoids the call if the collector is disabled.  That
> needs to be changed if we want this to work reliably.

Indeed. And actually it is fine to remove the call to FreeFile() in
the error code path of pg_current_logfile().

> I don't think the "abstract names" stuff is an improvement (just look at
> the quoting mess in ConfigureNamesString).  I think we should do without
> that; at least as part of this patch.  If you think there's code that
> can get better because of the idea, let's see it.

Agreed.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba_file_settings view patch

2017-01-18 Thread Michael Paquier
On Thu, Jan 19, 2017 at 4:25 PM, Haribabu Kommi
 wrote:
> Added the cleanup mechanism. But the tokenize_file() function call
> present in many places, But in one flow still it is possible to have
> file descriptor leak because of pg_hba_rules view. Because of this
> reason, added the cleanup everywhere.

Oops, sorry. Actually you don't need that. AllocateFile() registers
the fd opened with the sub-transactions it is involved with... So if
there is an ERROR nothing leaks.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba_file_settings view patch

2017-01-18 Thread Haribabu Kommi
On Thu, Jan 19, 2017 at 4:08 PM, Michael Paquier 
wrote:

> On Wed, Jan 18, 2017 at 4:11 PM, Haribabu Kommi
>  wrote:
> > updated patch attached.
>
> Thanks for the new version.
>
> > Added tap tests patch also attached.
>
> This begins to look really nice. I am having fun torturing it :)
>

Thanks for the review.

Here are I think my last comments:
>
> +   linecxt = tokenize_file(HbaFileName, file, &hba_lines,
> &hba_line_nums, &hba_raw_lines);
> +   FreeFile(file);
> tokenize_file can leave on ERROR, in which case the file descriptor
> would leak. You much likely need a
> PG_END_ENSURE_ERROR_CLEANUP/PG_ENSURE_ERROR_CLEANUP block here with a
> callback to FreeFile() if an error is caught.
>

Added the cleanup mechanism. But the tokenize_file() function call
present in many places, But in one flow still it is possible to have
file descriptor leak because of pg_hba_rules view. Because of this
reason, added the cleanup everywhere.


> + 
> +  ADDRESS specifies the set of hosts the record matches.
> +  It can be a host name, or it is made up of an IP address
> +  or keywords such as (all,
> +  samehost and samenet).
> + 
> Why is that upper-case?
>

Corrected.

+ 
> +  If not null, an error message indicating why this
> +  rule could not be loaded
> + 
> Need a dot here, that's a sentence.
>

updated.

src/test/regress/expected/rules.out needs to be refreshed, regression
> tests are failing.
>

Corrected.

Updated patch attached.

Regards,
Hari Babu
Fujitsu Australia


pg_hba_rules_9.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Grouped base relation

2017-01-18 Thread Tomas Vondra

On 01/19/2017 04:09 AM, Ashutosh Bapat wrote:

On Thu, Jan 19, 2017 at 12:02 AM, Robert Haas  wrote:

On Tue, Jan 17, 2017 at 11:33 PM, Ashutosh Bapat


Also, the point is naming that kind of function as aggtransmultifn
would mean that it's always supposed to multiply, which isn't true for
all aggregates.


TransValue * integer = newTransValue is well-defined for any
aggregate.  It's the result of aggregating that TransValue with itself
a number of times defined by the integer.  And that might well be
significantly faster than using aggcombinefn many times.  On the other
hand, how many queries just sit there are re-aggregate the same
TransValues over and over again?  I am having trouble wrapping my head
around that part of this.


Not all aggregates have TransValue * integer = newTransValue
behaviour. Example is array_agg() or string_agg() has "TransValue
concatenated integer time" behaviour. Or an aggregate "multiplying"
values across rows will have TransValue (raised to) integer behaviour.
Labelling all of those as "multi" doesn't look good.



All aggregates that have (or can have) a combine function have it, 
because in the worst case you can simply implement it by calling the 
combine function repeatedly.


Also, if you treat the combine function as "+" then the "multiply" 
function is exactly what "*" is expected to do. So I find the naming 
quite appropriate, actually.


But I think naming of the function is not the most important aspect of 
the patch, I believe. In the worst case, we can do s/multi/whatever/ 
sometime later.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Causal reads take II

2017-01-18 Thread Ants Aasma
On Tue, Jan 3, 2017 at 3:43 AM, Thomas Munro
 wrote:
> Here is a new version of my "causal reads" patch (see the earlier
> thread from the 9.6 development cycle[1]), which provides a way to
> avoid stale reads when load balancing with streaming replication.

Thanks for working on this. It will let us do something a lot of
people have been asking for.

> Long term, I think it would be pretty cool if we could develop a set
> of features that give you distributed sequential consistency on top of
> streaming replication.  Something like (this | causality-tokens) +
> SERIALIZABLE-DEFERRABLE-on-standbys[3] +
> distributed-dirty-read-prevention[4].

Is it necessary that causal writes wait for replication before making
the transaction visible on the master? I'm asking because the per tx
variable wait time between logging commit record and making
transaction visible makes it really hard to provide matching
visibility order on master and standby. In CSN based snapshot
discussions we came to the conclusion that to make standby visibility
order match master while still allowing for async transactions to
become visible before they are durable we need to make the commit
sequence a vector clock and transmit extra visibility ordering
information to standby's. Having one more level of delay between wal
logging of commit and making it visible would make the problem even
worse.

One other thing that might be an issue for some users is that this
patch only ensures that clients observe forwards progress of database
state after a writing transaction. With two consecutive read only
transactions that go to different servers a client could still observe
database state going backwards. It seems that fixing that would
require either keeping some per client state or a global agreement on
what snapshots are safe to provide, both of which you tried to avoid
for this feature.

Regards,
Ants Aasma


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-01-18 Thread vinayak


On 2017/01/16 17:35, Masahiko Sawada wrote:

On Fri, Jan 13, 2017 at 3:48 PM, Masahiko Sawada  wrote:

On Fri, Jan 13, 2017 at 3:20 PM, Ashutosh Bapat
 wrote:

Long time passed since original patch proposed by Ashutosh, so I
explain again about current design and functionality of this feature.
If you have any question, please feel free to ask.

Thanks for the summary.


Parameters
==

[ snip ]


Cluster-wide atomic commit
===
Since the distributed transaction commit on foreign servers are
executed independently, the transaction that modified data on the
multiple foreign servers is not ensured that transaction did either
all of them commit or all of them rollback. The patch adds the
functionality that guarantees distributed transaction did either
commit or rollback on all foreign servers. IOW the goal of this patch
is achieving the cluster-wide atomic commit across foreign server that
is capable two phase commit protocol.

In [1], I proposed that we solve the problem of supporting PREPARED
transactions involving foreign servers and in subsequent mail Vinayak
agreed to that. But this goal has wider scope than that proposal. I am
fine widening the scope, but then it would again lead to the same
discussion we had about the big picture. May be you want to share
design (or point out the parts of this design that will help) for
solving smaller problem and tone down the patch for the same.


Sorry for confuse you. I'm still focusing on solving only that
problem. What I was trying to say is that I think that supporting
PREPARED transaction involving foreign server is the means, not the
end. So once we supports PREPARED transaction involving foreign
servers we can achieve cluster-wide atomic commit in a sense.


Attached updated patches. I fixed some bugs and add 003 patch that
adds TAP test for foreign transaction.
003 patch depends 000 and 001 patch.

Please give me feedback.


I have tested prepared transactions with foreign servers but after 
preparing the transaction

the following error occur infinitely.
Test:
=
=#BEGIN;
=#INSERT INTO ft1_lt VALUES (10);
=#INSERT INTO ft2_lt VALUES (20);
=#PREPARE TRANSACTION 'prep_xact_with_fdw';

2017-01-18 15:09:48.378 JST [4312] ERROR:  function pg_fdw_resolve() 
does not exist at character 8
2017-01-18 15:09:48.378 JST [4312] HINT:  No function matches the given 
name and argument types. You might need to add explicit type casts.

2017-01-18 15:09:48.378 JST [4312] QUERY:  SELECT pg_fdw_resolve()
2017-01-18 15:09:48.378 JST [29224] LOG:  worker process: foreign 
transaction resolver (dbid 13119) (PID 4312) exited with exit code 1

.

If we check the status on another session then it showing the status as 
prepared.

=# select * from pg_fdw_xacts;
 dbid  | transaction | serverid | userid |  status  | identifier
---+-+--++--+ 

 13119 |1688 |16388 | 10 | prepared | 
px_2102366504_16388_10
 13119 |1688 |16389 | 10 | prepared | 
px_749056984_16389_10

(2 rows)

I think this is a bug.

Regards,
Vinayak Pokale
NTT Open Source Software Center



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add pgstathashindex() to get hash index table statistics.

2017-01-18 Thread Ashutosh Sharma
> However, I've some minor comments on the patch:
>
> +/*
> + * HASH_ALLOCATABLE_PAGE_SZ represents allocatable
> + * space (pd_upper - pd_lower) on a hash page.
> + */
> +#define HASH_ALLOCATABLE_PAGE_SZ \
> +   BLCKSZ - \
> +   (SizeOfPageHeaderData + sizeof(HashPageOpaqueData))
> My suggestion will be not to write "(pd_upper - pd_lower)" in the
> comment. You may write allocatable space on a empty hash page.

Accepted. Have changed the comment accordingly.

>
> +   buf = ReadBufferExtended(rel, MAIN_FORKNUM, blkno,
> RBM_NORMAL, NULL);
> Use BAS_BULKREAD strategy to read the buffer.
>

okay, corrected. Please check the attached v3 patch with corrections.

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com


0001-Add-pgstathashindex-to-pgstattuple-extension-v3.patch
Description: invalid/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Review: GIN non-intrusive vacuum of posting tree

2017-01-18 Thread Jeff Davis
https://www.postgresql.org/message-id/CAJEAwVE4UAmm8fr%2BNW8XTnKV6M--ACoNhL3ES8yoKL2sKhbaiw%40mail.gmail.com

Let me re-summarize what's been done here to make sure I understand:

Each key in GIN has its own posting tree, which is itself a btree,
holding all of the tuples that contain that key. This posting tree is
really just a set of tuples, and searches always scan an entire
posting tree (because all the tuples contain the key, so all are a
potential match).

Currently, the cleanup process requires blocking all reads and writes
in the posting list. I assume this is only a problem when a key is
common enough that its posting list is quite large (how large?).

This patch makes a couple changes to improve concurrency:

1. Vacuum leaves first, without removing any pages. Instead, just keep
track of pages which are completely empty (more generally, subtrees
that contain empty pages).

2. Free the empty pages in those subtrees. That requires blocking
reads and writes to that subtree, but not the entire posting list.
Blocking writes is easy: acquiring a cleanup lock on the root of any
subtree always blocks any writes to that subtree, because the write
keeps a pin on all pages in that path. Blocking reads is accomplished
by locking the page to be deleted, then locking/unlocking the page one
left of that (which may be outside the subtree?).


Maybe a basic question, but why is the posting list a btree at all,
and not, say a doubly-linked list?

Review of the code itself:

* Nice and simple, with a net line count of only +45.
* Remove commented-out code.
* In the README, "leaf-to-right" should be left-to-right; and "takinf"
should be "taking".
* When you say the leftmost page is never deleted; do you mean the
leftmost of the entire posting tree, or leftmost of the subtree that
you are removing pages from?
* In order to keep out concurrent reads, you need to lock/unlock the
left page while holding exclusive lock on the page being deleted, but
I didn't see how that happens exactly in the code. Where does that
happen?

Regards,
 Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

2017-01-18 Thread Noah Misch
On Wed, Jan 18, 2017 at 02:30:38PM +0900, Michael Paquier wrote:
> On Wed, Jan 18, 2017 at 2:23 PM, Noah Misch  wrote:
> > The latest versions document this precisely, but I agree with Peter's 
> > concern
> > about plain "scram".  Suppose it's 2025 and PostgreSQL support SASL 
> > mechanisms
> > OAUTHBEARER, SCRAM-SHA-256, SCRAM-SHA-256-PLUS, and SCRAM-SHA3-512.  What
> > should the pg_hba.conf options look like at that time?  I don't think 
> > having a
> > single "scram" option fits in such a world.
> 
> Sure.
> 
> > I see two strategies that fit:
> >
> > 1. Single "sasl" option, with a GUC, similar to ssl_ciphers, controlling the
> >mechanisms to offer.
> > 2. Separate options "scram_sha_256", "scram_sha3_512", "oauthbearer", etc.
> 
> Or we could have a sasl option, with a mandatory array of mechanisms
> to define one or more items, so method entries in pg_hba.conf would
> look llke that:
> sasl mechanism=scram_sha_256,scram_sha3_512

I like that.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-18 Thread Alvaro Herrera
Karl O. Pinc wrote:

> @@ -511,10 +519,16 @@ int
>  SysLogger_Start(void)
>  {
>   pid_t   sysloggerPid;
> - char   *filename;
>  
> + /*
> +  * Logging collector is not enabled. We don't know where messages are
> +  * logged.  Remove outdated file holding the current log filenames.
> +  */
>   if (!Logging_collector)
> + {
> + unlink(LOG_METAINFO_DATAFILE);
>   return 0;
> + }

I thought this part was odd -- I mean, why is SysLogger_Start() being
called if the collector is not enabled?  Turns out we do it and return
early if not enabled.  But not in all cases -- there is one callsite in
postmaster.c that avoids the call if the collector is disabled.  That
needs to be changed if we want this to work reliably.

I don't think the "abstract names" stuff is an improvement (just look at
the quoting mess in ConfigureNamesString).  I think we should do without
that; at least as part of this patch.  If you think there's code that
can get better because of the idea, let's see it.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2017-01-18 Thread Amit Langote
On 2017/01/19 5:29, Robert Haas wrote:
> On Wed, Jan 18, 2017 at 3:12 PM, Robert Haas  wrote:
>> On Tue, Jan 10, 2017 at 6:06 AM, Amit Langote
>>  wrote:
>>> [ updated patches ]
>>
>> I committed 0004 and also fixed the related regression test not to
>> rely on DROP .. CASCADE, which isn't always stable.  The remainder of

Thanks!

>> this patch set needs a rebase, and perhaps you could also fold in
>> other pending partitioning fixes so I have everything to look at it in
>> one place.
> 
> Just to be a little more clear, I don't mind multiple threads each
> with a patch or patch set so much, but multiple patch sets on the same
> thread gets hard for me to track.  Sorry for the inconvenience.

OK, I agree that having multiple patch sets on the same thread is cumbersome.

So, here are all the patches I posted to date (and one new at the bottom)
for reported and unreported bugs, excluding the one involving
BulkInsertState for which you replied in a new thread.

I'll describe the attached patches in brief:

0001-Fix-a-bug-of-insertion-into-an-internal-partition.patch

Since implicit partition constraints are not inherited, an internal
partition's constraint was not being enforced when targeted directly.
So, include such constraint when setting up leaf partition result
relations for tuple-routing.

0002-Set-ecxt_scantuple-correctly-for-tuple-routing.patch

In 2ac3ef7a01df859c62d0a02333b646d65eaec5ff, we changed things so that
it's possible for a different TupleTableSlot to be used for partitioned
tables at successively lower levels.  If we do end up changing the slot
from the original, we must update ecxt_scantuple to point to the new one
for partition key of the tuple to be computed correctly.

Last posted here:
https://www.postgresql.org/message-id/99fbfad6-b89b-9427-a6ca-197aad98c48e%40lab.ntt.co.jp

0003-Fix-RETURNING-to-work-correctly-after-tuple-routing.patch

In ExecInsert(), do not switch back to the root partitioned table
ResultRelInfo until after we finish ExecProcessReturning(), so that
RETURNING projection is done using the partition's descriptor.  For
the projection to work correctly, we must initialize the same for
each leaf partition during ModifyTableState initialization.

0004-Fix-some-issues-with-views-and-partitioned-tables.patch

Automatically updatable views failed to handle partitioned tables.
Once that's fixed, WITH CHECK OPTIONS wouldn't work correctly without
the WCO expressions having been suitably converted for each partition
(think applying map_partition_varattnos to Vars in the WCO expressions
just like with partition constraint expressions).

0005-Fix-some-wrong-thinking-in-check_new_partition_bound.patch

Because a given range bound in the PartitionBoundInfo.datums array
is sometimes a range lower bound and upper bound at other times, we
must be careful when assuming which, especially when interpreting
the result of partition_bound_bsearch which returns the index of the
greatest bound that is less than or equal to probe.  Due to an error
in thinking about the same, the relevant code in
check_new_partition_bound() caused invalid partition (index = -1)
to be chosen as the partition being overlapped.

Last posted here:
https://www.postgresql.org/message-id/603acb8b-5dec-31e8-29b0-609a68aac50f%40lab.ntt.co.jp

0006-Avoid-tuple-coversion-in-common-partitioning-cases.patch

Currently, the tuple conversion is performed after a tuple is routed,
even if the attributes of a target leaf partition map one-to-one with
those of the root table, which is wasteful.  Avoid that by making
convert_tuples_by_name() return a NULL map for such cases.

0007-Avoid-code-duplication-in-map_partition_varattnos.patch

Code to map attribute numbers in map_partition_varattnos() duplicates
what convert_tuples_by_name_map() does.  Avoid that as pointed out by
Álvaro Herrera.

Last posted here:
https://www.postgresql.org/message-id/9ce97382-54c8-deb3-9ee9-a2ec271d866b%40lab.ntt.co.jp

0008-Avoid-DROP-TABLE-.-CASCADE-in-more-partitioning-test.patch

This is the new one.  There were quite a few commits recently to fix the
breakage in regression tests due to not using ORDER BY in queries on
system catalogs and using DROP TABLE ... CASCADE.  There were still some
instances of the latter in create_table.sql and alter_table.sql.

Thanks,
Amit
>From 37c861d5eae1c8f11b3fae93cb209da262a13fbc Mon Sep 17 00:00:00 2001
From: amit 
Date: Tue, 13 Dec 2016 15:07:41 +0900
Subject: [PATCH 1/8] Fix a bug of insertion into an internal partition.

Since implicit partition constraints are not inherited, an internal
partition's constraint was not being enforced when targeted directly.
So, include such constraint when setting up leaf partition result
relations for tuple-routing.

Reported by: n/a
Patch by: Amit Langote
Reports: n/a
---
 src/backend/commands/copy.c  |  1 -
 src/backend/commands/tablecmds.c |  1 -
 src/backend/executor/execMain.c  | 42 
 src/include/executor/executor.h 

Re: [HACKERS] pg_hba_file_settings view patch

2017-01-18 Thread Michael Paquier
On Wed, Jan 18, 2017 at 4:11 PM, Haribabu Kommi
 wrote:
> updated patch attached.

Thanks for the new version.

> Added tap tests patch also attached.

This begins to look really nice. I am having fun torturing it :)

Here are I think my last comments:

+   linecxt = tokenize_file(HbaFileName, file, &hba_lines,
&hba_line_nums, &hba_raw_lines);
+   FreeFile(file);
tokenize_file can leave on ERROR, in which case the file descriptor
would leak. You much likely need a
PG_END_ENSURE_ERROR_CLEANUP/PG_ENSURE_ERROR_CLEANUP block here with a
callback to FreeFile() if an error is caught.

+ 
+  ADDRESS specifies the set of hosts the record matches.
+  It can be a host name, or it is made up of an IP address
+  or keywords such as (all,
+  samehost and samenet).
+ 
Why is that upper-case?

+ 
+  If not null, an error message indicating why this
+  rule could not be loaded
+ 
Need a dot here, that's a sentence.

src/test/regress/expected/rules.out needs to be refreshed, regression
tests are failing.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw bug in 9.6

2017-01-18 Thread Ashutosh Bapat
On Thu, Jan 19, 2017 at 2:14 AM, Robert Haas  wrote:
> On Fri, Jan 13, 2017 at 6:22 AM, Etsuro Fujita
>  wrote:
>> My biggest concern about GetExistingLocalJoinPath is that might not be
>> extendable to the case of foreign-join paths with parameterization; in which
>> case, fdw_outerpath for a given foreign-join path would need to have the
>> same parameterization as the foreign-join path, but there might not be any
>> existing paths with the same parameterization in the path list.
>
> I agree that this is a problem.

Effectively, it means that foreign join path creation will have a
parameterization different (per add_path()) from any local join
produced. But why would it be so? The parameterization bubbles up from
the base relation. The process for creating parameterized local and
foreign paths for a base relation is same. Thus we will have same
parameterizations considered for foreign as well as local joins. Those
different parameterizations will be retained add_path(), so we should
find one there or should be able to expand an existing
parameterization by reparameterization.

Even if such a case exists, the same problem exists while searching
paths from the joining relations. If the join doesn't have a local
path with required parameterization, so would be the joining
relations, parameterized paths from which are used to build
parameterized paths for join.

>
>> You might
>> think we could get the fdw_outerpath by getting an existing path with no
>> parameterization as in GetExistingLocalJoinPath and then modifying the
>> path's param_info to match the parameterization of the foreign-join path.  I
>> don't know that really works, but that might be inefficient.
>
> I am not sure about this.
>
>> What I have in mind to support foreign-join paths with parameterization for
>> postgres_fdw like this: (1) generate parameterized paths from any joinable
>> combination of the outer/inner cheapest-parameterized paths that have pushed
>> down the outer/inner relation to the remote server, in a similar way as
>> postgresGetForeignJoinPaths creates unparameterized paths, and (2) create
>> fdw_outerpath for each parameterized path from the outer/inner paths used to
>> generate the parameterized path, by create_nestloop_path (or,
>> create_hashjoin_path or create_mergejoin_path if full join), so that the
>> resulting fdw_outerpath has the same parameterization as the paramterized
>> path.  This would probably work and might be more efficient.  And the patch
>> I proposed would be easily extended to this, by replacing the outer/inner
>> cheapest-total paths with the outer/inner cheapest-parameterized paths.
>> Attached is the latest version of the patch.
>
> Yes, I think that's broadly the approach Tom was recommending.

I don't have problem with that approach, but the latest patch has
following problems.
1. We are copying chunks of path creation logic, instead of reusing
corresponding functions.
2. There are many cases where the new function would return no local
path and hence postgres_fdw doesn't push down the join [1]. This will
be performance regression compared to 9.6.

[1] https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg302463.html
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Index Scans

2017-01-18 Thread Haribabu Kommi
On Wed, Jan 18, 2017 at 6:55 PM, Rahila Syed  wrote:

> >+ /* Check if the scan for current scan keys is finished */
> >+ if (so->arrayKeyCount < btscan->btps_arrayKeyCount)
> >+ *status = false;
>
> >I didn't clearly understand, in which scenario the arrayKeyCount is less
> >than btps_arrayKeyCount?
> Consider following array scan keys
>
> select * from test2 where j=ANY(ARRAY[1000,2000,3000]);
>
> By the time the current worker has finished reading heap tuples
> corresponding
> to array key 1000(arrayKeyCount = 0), some other worker might have
> advanced the scan to the
> array key 2000(btps_arrayKeyCount =1). In this case when the current
> worker fetches next page to scan,
> it must advance its scan keys before scanning the next page of parallel
> scan.
> I hope this helps.
>

Thanks for the details.
One worker incremented arrayKeyCount and btps_arrayKeyCount both. As
btps_arrayKeyCount present in shared memory, so the other worker see the
update
and hits above the condition.


> >+BlockNumber
> >+_bt_parallel_seize(IndexScanDesc scan, bool *status)
>
> >The return value of the above function is validated only in _bt_first
> >function, but in other cases it is not.
> In other cases it is validated in _bt_readnextpage() which is called after
> _bt_parallel_seize().
>
> >From the function description,
> >it is possible to return P_NONE for the workers also with status as
> >true. I feel it is better to handle the P_NONE case internally only
> >so that callers just check for the status. Am i missing anything?
>
> In case of the next block being P_NONE and status true, the code
> calls _bt_parallel_done() to notify other workers followed by
> BTScanPosInvalidate(). Similar check for block = P_NONE also
> happens in existing code. See following in _bt_readnextpage(),
>
>
> if (blkno == P_NONE || !so->currPos.moreRight)
> {
>_bt_parallel_done(scan);
> BTScanPosInvalidate(so->currPos);
> return false;
> }
> So to keep it consistent with the existing code, the check
> is kept outside _bt_parallel_seize().
>

Thanks. Got it.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] PoC: Grouped base relation

2017-01-18 Thread Ashutosh Bapat
On Thu, Jan 19, 2017 at 12:02 AM, Robert Haas  wrote:
> On Tue, Jan 17, 2017 at 11:33 PM, Ashutosh Bapat
>  wrote:
>> I don't think aggcombinefn isn't there because we couldn't write it
>> for array_agg() or string_agg(). I guess, it won't be efficient to
>> have those aggregates combined across parallel workers.
>
> I think there are many cases where it would work fine.  I assume that
> David just didn't make it a priority to write those functions because
> it wasn't important to the queries he wanted to optimize.  But
> somebody can submit a patch for it any time and I bet it will have
> practical use cases.  There might be some performance problems shoving
> large numbers of lengthy values through a shm_mq, but we won't know
> that until somebody tries it.
>
>> Also, the point is naming that kind of function as aggtransmultifn
>> would mean that it's always supposed to multiply, which isn't true for
>> all aggregates.
>
> TransValue * integer = newTransValue is well-defined for any
> aggregate.  It's the result of aggregating that TransValue with itself
> a number of times defined by the integer.  And that might well be
> significantly faster than using aggcombinefn many times.  On the other
> hand, how many queries just sit there are re-aggregate the same
> TransValues over and over again?  I am having trouble wrapping my head
> around that part of this.

Not all aggregates have TransValue * integer = newTransValue
behaviour. Example is array_agg() or string_agg() has "TransValue
concatenated integer time" behaviour. Or an aggregate "multiplying"
values across rows will have TransValue (raised to) integer behaviour.
Labelling all of those as "multi" doesn't look good.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Ryan Murphy
> The on-screen output isn't all that helpful for diagnosing what went
> wrong.  You might learn more by looking at the regression.diffs files.
> Remember that errors tend to cascade, so the first one(s) in any
> particular test suite are the most important --- the rest might just
> be fallout.
>
>
Thanks Tom, I'll check out the diffs.


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-18 Thread Michael Paquier
On Thu, Jan 19, 2017 at 6:56 AM, Karl O. Pinc  wrote:
> On Wed, 18 Jan 2017 15:08:09 -0600
> "Karl O. Pinc"  wrote:
>
>> I would like to see index entries for "current_logfiles"
>> so this stuff is findable.
>
> Attached is a v27 of the patch.
>
> I polished some of the sentences in the documentation
> and added index entries.
>
> Also attached are a series of 2 patches to apply on top
> of v27 which make symbols out of hardcoded constants.

+ 
+   current_logfiles
+   and the log_destination configuration parameter
+ 
I have been wondering about this portion a bit, and actually that's pretty nice.

So patch is marked as ready for committer, the version proposed here
not using SRFs per Gilles' input on the matter.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2017-01-18 Thread Haribabu Kommi
On Sat, Jan 14, 2017 at 6:28 PM, Kuntal Ghosh 
wrote:

> On Mon, Jan 9, 2017 at 1:45 PM, Haribabu Kommi 
> wrote:
> > Updated patch is attached.
> >
> I've a few comments about the patch.
>

Thanks for the review.

+ This type can accept both 6 and 8 bytes length MAC addresses.
> A 6 bytes length MAC address is internally converted to 8 bytes. We
> should include this in the docs. Because output is always 8 bytes.
> Otherwise, a user may be surprised with the output.
>

updated accordingly.


> +#define hibits(addr) \
> +  ((unsigned long)(((addr)->a<<24)|((addr)->b<<16)|((addr)->c<<8)|((addr)
> ->d)))
> +
> +#define lobits(addr) \
> +  ((unsigned long)(((addr)->e<<24)|((addr)->f<<16)|((addr)->g<<8)|((addr)
> ->h)))
> +
> There should be some spacing.
>

corrected.

+ if (!eight_byte_address)
> + {
> + d = 0xFF;
> + e = 0xFE;
> + }
> You already have count variable. Just check (count != 8).
>

Changed.


> + res *= 256 * 256;
> + res *= 256 * 256;
> Bit shift operator can be used for this. For example: (res << 32).
>

Changed by adding a typecast, because res is a double datatype value.


> -DATA(insert ( 403 macaddr_ops PGNSP PGUID 1984  829 t 0 ));
> -DATA(insert ( 405 macaddr_ops PGNSP PGUID 1985  829 t 0 ));
> +DATA(insert ( 403 macaddr_ops PGNSP PGUID 1984  829 t 0 ));
> +DATA(insert ( 405 macaddr_ops PGNSP PGUID 1985  829 t 0 ));
> This is unnecessary I guess.
>

Corrected.


> There was some whitespace error while applying the patch. Also, there
> are some spacing inconsistencies in the comments. I've not tested with
> this patch. I'll let you know once I'm done testing.
>

Corrected.

Updated patch attached.

Regards,
Hari Babu
Fujitsu Australia


mac_eui64_support_6.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Tom Lane
Ryan Murphy  writes:
> So I ran "make -i installcheck-world" and it ran to completion.  This is on
> a freshly "git pull"ed postgres source tree.  Certain tests failed, but
> most succeeded.

Those results look pretty broken :-(

> There was no "overall" indication of success or failure at
> the very end, so I'm not sure what this means.

You told make to ignore errors (with "-i"), so it didn't notice that
anything was wrong.

The on-screen output isn't all that helpful for diagnosing what went
wrong.  You might learn more by looking at the regression.diffs files.
Remember that errors tend to cascade, so the first one(s) in any
particular test suite are the most important --- the rest might just
be fallout.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] smallint out of range EXECUTEing prepared statement

2017-01-18 Thread Andrew Gierth
> "Justin" == Justin Pryzby  writes:

 Justin> Is this expected behavior ?

 Justin> ts=# SELECT * FROM t WHERE site_id=32768 LIMIT 1;
 Justin> (0 rows)

 Justin> ts=# PREPARE x AS SELECT * FROM t WHERE site_id=$1 LIMIT 1;
 Justin> PREPARE
 Justin> ts=# EXECUTE x(32768);
 Justin> ERROR:  smallint out of range

If column "site_id" is of type smallint, then parse analysis will deduce
a type of smallint for $1, which is otherwise of unknown type. So the
prepared statement "x" then has one parameter of type smallint.

Passing 32768 for that parameter therefore fails with the expected error.

 Justin> ts=# PREPARE y AS SELECT * FROM t WHERE site_id::int=$1 LIMIT 1;
 Justin> PREPARE

Now $1 is of type integer, not smallint, because parse analysis sees
(integer = unknown) and deduces the type from that.

(a better way would be WHERE site_id = $1::integer, which would allow
index usage on site_id, unlike your example)

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Ryan Murphy
>
> Jim Nasby said I shouldn't necessarily need to build the docs / the whole
> world in order to review patches.  But the Review form needs a `make
> installworld-check`.  Do I need to install the whole world in order to meet
> this requirement?  Happy to do so if required, but in that case, I wonder
> why 'osx' is having so much trouble parsing the SGML into XML?
>

Aha, I was able to run a "make -i install-world", which ignored the SGML
errors and built the rest of the world.  Now trying "make -i
installcheck-world".


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Ryan Murphy
>
>
> Ryan try to run 'make install-world' then 'make -i installcheck-world', -i
> option will ignore the error and proceed. You can check if any other tests
> fails. This is a separate issue, unrelated to this patch. I do not think
> we should stop from changing the status because of this.
>
>
Beena, when I try to run 'make install-world' I get a lot of errors from
the 'osx' executable which I think has to do with building the SGML
documentation:

make -C doc install
make -C src install
make -C sgml install
{ \
  echo ""; \
  echo ""; \
} > version.sgml
'/opt/local/bin/perl' ./mk_feature_tables.pl YES
../../../src/backend/catalog/sql_feature_packages.txt
../../../src/backend/catalog/sql_features.txt > features-supported.sgml
'/opt/local/bin/perl' ./mk_feature_tables.pl NO
../../../src/backend/catalog/sql_feature_packages.txt
../../../src/backend/catalog/sql_features.txt > features-unsupported.sgml
'/opt/local/bin/perl' ./generate-errcodes-table.pl
../../../src/backend/utils/errcodes.txt > errcodes-table.sgml
/usr/local/bin/osx -D. -x lower -i include-xslt-index postgres.sgml
>postgres.xmltmp
/usr/local/bin/osx:postgres.sgml:3:55:W: cannot generate system identifier
for public text "-//OASIS//DTD DocBook V4.2//EN"
/usr/local/bin/osx:postgres.sgml:12:0:E: reference to entity "BOOK" for
which no system identifier could be generated
/usr/local/bin/osx:postgres.sgml:3:0: entity was defined here
/usr/local/bin/osx:postgres.sgml:12:0:E: DTD did not contain element
declaration for document type name
/usr/local/bin/osx:postgres.sgml:14:9:E: there is no attribute "ID"
/usr/local/bin/osx:postgres.sgml:14:19:E: element "BOOK" undefined
/usr/local/bin/osx:postgres.sgml:15:7:E: element "TITLE" undefined
/usr/local/bin/osx:postgres.sgml:17:10:E: element "BOOKINFO" undefined
/usr/local/bin/osx:postgres.sgml:18:13:E: element "CORPAUTHOR" undefined
/usr/local/bin/osx:postgres.sgml:19:14:E: element "PRODUCTNAME" undefined
/usr/local/bin/osx:postgres.sgml:20:16:E: element "PRODUCTNUMBER" undefined
/usr/local/bin/osx:legal.sgml:3:5:E: element "DATE" undefined
/usr/local/bin/osx:legal.sgml:5:10:E: element "COPYRIGHT" undefined
/usr/local/bin/osx:legal.sgml:6:6:E: element "YEAR" undefined
...
/usr/local/bin/osx:history.sgml:173:13:E: element "LISTITEM" undefined
/usr/local/bin/osx:history.sgml:174:10:E: element "PARA" undefined
/usr/local/bin/osx:history.sgml:175:14:E: element "ACRONYM" undefined
/usr/local/bin/osx:I: maximum number of errors (200) reached; change with
-E option
make[3]: *** [postgres.xml] Error 1
make[2]: *** [install] Error 2
make[1]: *** [install] Error 2
make: *** [install-world-doc-recurse] Error 2


Jim Nasby said I shouldn't necessarily need to build the docs / the whole
world in order to review patches.  But the Review form needs a `make
installworld-check`.  Do I need to install the whole world in order to meet
this requirement?  Happy to do so if required, but in that case, I wonder
why 'osx' is having so much trouble parsing the SGML into XML?

Thanks for the help,
Ryan


Re: [HACKERS] [PATCH] ALTER DEFAULT PRIVILEGES with GRANT/REVOKE ON SCHEMAS

2017-01-18 Thread Stephen Frost
* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> On 10/01/17 17:33, Matheus de Oliveira wrote:
> > 
> > On Mon, Jan 9, 2017 at 10:58 AM, Ashutosh Sharma  > > wrote:
> > 
> > > Also, should I add translations for that error message in other 
> > languages (I
> > > can do that without help of tools for pt_BR) or is that a latter 
> > process in
> > > the releasing?
> > >
> > 
> > I think you should add it but i am not sure when it is done.
> > 
> > 
> > I'll ask one of the guys who work with pt_BR translations (I know him in
> > person).
> 
> Translations are not handled by patch author but by translation project
> so no need.
> 
> > 
> > Attached a rebased version and with the docs update pointed by Ashutosh
> > Sharma.
> > 
> 
> The patch looks good, the only thing I am missing is tab completion
> support for psql.

Awesome, glad to hear it.  This is also on my list of patches that I'm
planning to look at, just so folks know.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] ALTER DEFAULT PRIVILEGES with GRANT/REVOKE ON SCHEMAS

2017-01-18 Thread Petr Jelinek
On 10/01/17 17:33, Matheus de Oliveira wrote:
> 
> On Mon, Jan 9, 2017 at 10:58 AM, Ashutosh Sharma  > wrote:
> 
> > Also, should I add translations for that error message in other 
> languages (I
> > can do that without help of tools for pt_BR) or is that a latter 
> process in
> > the releasing?
> >
> 
> I think you should add it but i am not sure when it is done.
> 
> 
> I'll ask one of the guys who work with pt_BR translations (I know him in
> person).

Translations are not handled by patch author but by translation project
so no need.

> 
> Attached a rebased version and with the docs update pointed by Ashutosh
> Sharma.
> 

The patch looks good, the only thing I am missing is tab completion
support for psql.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
Robert Haas  writes:
> So, one of the big reasons I use CASE is to avoid evaluating
> expressions in cases where they might throw an ERROR.  Like, you know:
> CASE WHEN d != 0 THEN n / d ELSE NULL END
> I guess it's not the end of the world if that only works for
> non-set-returning functions, but it's something to think about.

Well, refusing CASE-containing-SRF at all isn't going to make your
life any better in that regard :-(

It's possibly worth noting that this is also true for aggregates and
window functions: wrapping those in a CASE doesn't stop them from being
evaluated, either.  People seem to be generally used to that, although
I think I've seen one or two complaints about it from folks who seemed
unclear on the concept of aggregates.

In the end I think this is mostly about backwards compatibility:
are we sufficiently worried about that that we'd rather throw an
error than have a silent change of behavior?  TBH I'm not sure.
We've certainly got two other silent changes of behavior in this
same patch.  The argument for treating this one differently,
I think, is that it's changing from a less surprising behavior
to a more surprising one whereas the other changes are the reverse,
or at worst neutral.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 7:00 PM, Andres Freund  wrote:
>>So, one of the big reasons I use CASE is to avoid evaluating
>>expressions in cases where they might throw an ERROR.  Like, you know:
>>
>>CASE WHEN d != 0 THEN n / d ELSE NULL END
>>
>>I guess it's not the end of the world if that only works for
>>non-set-returning functions, but it's something to think about.
>
> That's already not reliable in a bunch of cases, particularly evaluation 
> during planning...  Not saying that's good, but it is.

Whee!

:-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Grouped base relation

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 5:14 PM, David Rowley
 wrote:
> On 19 January 2017 at 07:32, Robert Haas  wrote:
>> On Tue, Jan 17, 2017 at 11:33 PM, Ashutosh Bapat
>>  wrote:
>>> I don't think aggcombinefn isn't there because we couldn't write it
>>> for array_agg() or string_agg(). I guess, it won't be efficient to
>>> have those aggregates combined across parallel workers.
>>
>> I think there are many cases where it would work fine.  I assume that
>> David just didn't make it a priority to write those functions because
>> it wasn't important to the queries he wanted to optimize.  But
>> somebody can submit a patch for it any time and I bet it will have
>> practical use cases.  There might be some performance problems shoving
>> large numbers of lengthy values through a shm_mq, but we won't know
>> that until somebody tries it.
>
> I had assumed that the combine function which combines a large array
> or a large string would not be any cheaper than doing that
> incrementally with the transfn. Of course some of this would happen in
> parallel, but it still doubles up some of the memcpy()ing, so perhaps
> it would be slower? ... I didn't ever get a chance to test it.

Even if that particular bit is not very much faster, it might have the
advantage of letting other parts of the plan be parallelized, and you
can still win that way.  In the internal-to-EnterpriseDB experiments
we've been doing over the last few months, we've seen that kind of
thing a lot, and it informs a lot of the patches that my colleagues
have been submitting.  But I also wouldn't be surprised if there are
cases where it wins big even without that.  For example, if you're
doing an aggregate with lots of groups and good physical-to-logical
correlation, the normal case might be for all the rows in a group to
be on the same page.  So you parallel seq scan the table and have
hardly any need to run the combine function in the leader (but of
course you have to have it available just in case you do need it).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Stephen Frost
* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Thu, Jan 19, 2017 at 5:01 AM, Peter Eisentraut
>  wrote:
> > On 1/18/17 8:25 AM, Stephen Frost wrote:
> >> I was actually thinking about it the other way- start out by changing
> >> them to both be 5m and then document next to checkpoint_timeout (and
> >> max_wal_size, perhaps...) that if you go changing those parameters (eg:
> >> bumping up checkpoint_timeout to 30 minutes and max_wal_size up enough
> >> that you're still checkpointing based on time and not due to running out
> >> of WAL space) then you might need to consider raising the timeout for
> >> pg_ctl to wait around for the server to finish going through crash
> >> recovery due to all of the outstanding changes since the last
> >> checkpoint.
> >
> > It is important for users to be aware of this, but I don't think the
> > relationship between checkpoint_timeout and recovery time is linear, so
> > it's unclear what the exact advice should be.
> 
> This is a right assumption for steady workloads with few DDLs, but for
> example once a couple of CREATE DATABASE records are in such a law is
> broken.

I don't expect CREATE DATABASE to be terribly frequent, and it doesn't
actually change the rule that the checkpoint_timeout is a maximum.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Logical Replication WIP

2017-01-18 Thread Petr Jelinek
On 17/01/17 22:43, Robert Haas wrote:
> On Tue, Jan 17, 2017 at 11:15 AM, Petr Jelinek
>  wrote:
>>> Is there anything stopping anyone from implementing it?
>>
>> No, just didn't seem priority for the functionality right now.
> 
> Why is it OK for this to not support rename like everything else does?
>  It shouldn't be more than a few hours of work to fix that, and I
> think leaving stuff like that out just because it's a lower priority
> is fairly short-sighted.
> 

Sigh, I wanted to leave it for next CF, but since you insist. Here is a
patch that adds rename.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
From 3ea2e0027bdeab5d6877ac7c18c28e12c7406b76 Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Thu, 19 Jan 2017 00:59:01 +0100
Subject: [PATCH 6/6] Add RENAME support for PUBLICATIONs and SUBSCRIPTIONs

---
 src/backend/commands/alter.c   |  6 
 src/backend/commands/publicationcmds.c | 50 ++
 src/backend/commands/subscriptioncmds.c| 50 ++
 src/backend/parser/gram.y  | 18 +++
 src/backend/replication/logical/worker.c   | 16 +-
 src/include/commands/publicationcmds.h |  2 ++
 src/include/commands/subscriptioncmds.h|  2 ++
 src/test/regress/expected/publication.out  | 10 +-
 src/test/regress/expected/subscription.out | 10 +-
 src/test/regress/sql/publication.sql   |  6 +++-
 src/test/regress/sql/subscription.sql  |  6 +++-
 src/test/subscription/t/001_rep_changes.pl | 11 ++-
 12 files changed, 181 insertions(+), 6 deletions(-)

diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
index 768fcc8..1a4154c 100644
--- a/src/backend/commands/alter.c
+++ b/src/backend/commands/alter.c
@@ -351,6 +351,12 @@ ExecRenameStmt(RenameStmt *stmt)
case OBJECT_TYPE:
return RenameType(stmt);
 
+   case OBJECT_PUBLICATION:
+   return RenamePublication(stmt);
+
+   case OBJECT_SUBSCRIPTION:
+   return RenameSubscription(stmt);
+
case OBJECT_AGGREGATE:
case OBJECT_COLLATION:
case OBJECT_CONVERSION:
diff --git a/src/backend/commands/publicationcmds.c 
b/src/backend/commands/publicationcmds.c
index 21e523d..8bc4e02 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -752,3 +752,53 @@ AlterPublicationOwner_oid(Oid subid, Oid newOwnerId)
 
heap_close(rel, RowExclusiveLock);
 }
+
+/*
+ * Rename the publication.
+ */
+ObjectAddress
+RenamePublication(RenameStmt *stmt)
+{
+   Oid pubid;
+   HeapTuple   tup;
+   Relationrel;
+   ObjectAddress address;
+
+   rel = heap_open(PublicationRelationId, RowExclusiveLock);
+
+   tup = SearchSysCacheCopy1(PUBLICATIONNAME,
+ 
CStringGetDatum(stmt->subname));
+
+   if (!HeapTupleIsValid(tup))
+   ereport(ERROR,
+   (errcode(ERRCODE_UNDEFINED_OBJECT),
+errmsg("publication \"%s\" does not exist", 
stmt->subname)));
+
+   /* make sure the new name doesn't exist */
+   if (OidIsValid(get_publication_oid(stmt->newname, true)))
+   ereport(ERROR,
+   (errcode(ERRCODE_DUPLICATE_SCHEMA),
+errmsg("publication \"%s\" already exists", 
stmt->newname)));
+
+   pubid = HeapTupleGetOid(tup);
+
+   /* Must be owner. */
+   if (!pg_publication_ownercheck(pubid, GetUserId()))
+   aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_PUBLICATION,
+  stmt->subname);
+
+   /* rename */
+   namestrcpy(&(((Form_pg_publication) GETSTRUCT(tup))->pubname),
+  stmt->newname);
+   simple_heap_update(rel, &tup->t_self, tup);
+   CatalogUpdateIndexes(rel, tup);
+
+   InvokeObjectPostAlterHook(PublicationRelationId, pubid, 0);
+
+   ObjectAddressSet(address, PublicationRelationId, pubid);
+
+   heap_close(rel, NoLock);
+   heap_freetuple(tup);
+
+   return address;
+}
diff --git a/src/backend/commands/subscriptioncmds.c 
b/src/backend/commands/subscriptioncmds.c
index 1448ee3..56b254e 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -641,3 +641,53 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
 
heap_close(rel, RowExclusiveLock);
 }
+
+/*
+ * Rename the subscription.
+ */
+ObjectAddress
+RenameSubscription(RenameStmt *stmt)
+{
+   Oid subid;
+   HeapTuple   tup;
+   Relationrel;
+   ObjectAddress address;
+
+   rel = heap_open(SubscriptionRelationId, RowExclusiveLock);
+
+   tup = SearchS

Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Stephen Frost
* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Thu, Jan 19, 2017 at 6:20 AM, Stephen Frost  wrote:
> > On Wed, Jan 18, 2017 at 16:15 Robert Haas  wrote:
> >> On Wed, Jan 18, 2017 at 3:59 PM, Stephen Frost  wrote:
> >> > For non-cold standby configurations, pg_ctl is going to return just as
> >> > soon as the database has finished crash recovery, which in most cases
> >> > will probably be on the order of a few seconds.
> >>
> >> /me is poleaxed.
> >>
> >> Yeah, that was a confused sentence- most of the time it's going to return
> >> on the order of a few seconds because we're doing regular startup without
> >> having to do any crash recovery.
> >>
> >>
> >> For actual crash recovery cases, it'll take between a few seconds and
> >> checkpoint_timeout, as I described up-thread, based on the amount of
> >> outstanding WAL.
> 
> Recovering up to the minumum recovery point could take minutes!

Right, in an actual crash recovery case, it'll take longer, up to
checkpoint_timeout.

I am certainly hopeful that, most of the time, we're actually starting
up from a cleanly shut down system.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund


On January 18, 2017 3:59:00 PM PST, Robert Haas  wrote:
>On Wed, Jan 18, 2017 at 6:19 PM, Andres Freund 
>wrote:
>>>   SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END
>FROM tab;
>>>
>>>   It might seem that this should produce five repetitions of input
>rows
>>>   that have x > 0, and a single repetition of those that do not; but
>>>   actually it will produce five repetitions of every input row. This
>is
>>>   because generate_series() is run first, and then the CASE
>expression is
>>>   applied to its result rows. The behavior is thus comparable to
>>>
>>>   SELECT x, CASE WHEN x > 0 THEN g ELSE 0 END
>>> FROM tab, LATERAL generate_series(1,5) AS g;
>>>
>>>   It would be exactly the same, except that in this specific
>example, the
>>>   planner could choose to put g on the outside of the nestloop join,
>since
>>>   g has no actual lateral dependency on tab. That would result in a
>>>   different output row order. Set-returning functions in the select
>list
>>>   are always evaluated as though they are on the inside of a
>nestloop join
>>>   with the rest of the FROM clause, so that the function(s) are run
>to
>>>   completion before the next row from the FROM clause is considered.
>>>
>>> So is this too ugly to live, or shall we put up with it?
>>
>> I'm very tentatively in favor of living with it.
>
>So, one of the big reasons I use CASE is to avoid evaluating
>expressions in cases where they might throw an ERROR.  Like, you know:
>
>CASE WHEN d != 0 THEN n / d ELSE NULL END
>
>I guess it's not the end of the world if that only works for
>non-set-returning functions, but it's something to think about.

That's already not reliable in a bunch of cases, particularly evaluation during 
planning...  Not saying that's good, but it is.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 6:19 PM, Andres Freund  wrote:
>>   SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
>>
>>   It might seem that this should produce five repetitions of input rows
>>   that have x > 0, and a single repetition of those that do not; but
>>   actually it will produce five repetitions of every input row. This is
>>   because generate_series() is run first, and then the CASE expression is
>>   applied to its result rows. The behavior is thus comparable to
>>
>>   SELECT x, CASE WHEN x > 0 THEN g ELSE 0 END
>> FROM tab, LATERAL generate_series(1,5) AS g;
>>
>>   It would be exactly the same, except that in this specific example, the
>>   planner could choose to put g on the outside of the nestloop join, since
>>   g has no actual lateral dependency on tab. That would result in a
>>   different output row order. Set-returning functions in the select list
>>   are always evaluated as though they are on the inside of a nestloop join
>>   with the rest of the FROM clause, so that the function(s) are run to
>>   completion before the next row from the FROM clause is considered.
>>
>> So is this too ugly to live, or shall we put up with it?
>
> I'm very tentatively in favor of living with it.

So, one of the big reasons I use CASE is to avoid evaluating
expressions in cases where they might throw an ERROR.  Like, you know:

CASE WHEN d != 0 THEN n / d ELSE NULL END

I guess it's not the end of the world if that only works for
non-set-returning functions, but it's something to think about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Michael Paquier
On Thu, Jan 19, 2017 at 5:01 AM, Peter Eisentraut
 wrote:
> On 1/18/17 8:25 AM, Stephen Frost wrote:
>> I was actually thinking about it the other way- start out by changing
>> them to both be 5m and then document next to checkpoint_timeout (and
>> max_wal_size, perhaps...) that if you go changing those parameters (eg:
>> bumping up checkpoint_timeout to 30 minutes and max_wal_size up enough
>> that you're still checkpointing based on time and not due to running out
>> of WAL space) then you might need to consider raising the timeout for
>> pg_ctl to wait around for the server to finish going through crash
>> recovery due to all of the outstanding changes since the last
>> checkpoint.
>
> It is important for users to be aware of this, but I don't think the
> relationship between checkpoint_timeout and recovery time is linear, so
> it's unclear what the exact advice should be.

This is a right assumption for steady workloads with few DDLs, but for
example once a couple of CREATE DATABASE records are in such a law is
broken.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Michael Paquier
On Thu, Jan 19, 2017 at 6:20 AM, Stephen Frost  wrote:
> On Wed, Jan 18, 2017 at 16:15 Robert Haas  wrote:
>> On Wed, Jan 18, 2017 at 3:59 PM, Stephen Frost  wrote:
>> > For non-cold standby configurations, pg_ctl is going to return just as
>> > soon as the database has finished crash recovery, which in most cases
>> > will probably be on the order of a few seconds.
>>
>> /me is poleaxed.
>>
>> Yeah, that was a confused sentence- most of the time it's going to return
>> on the order of a few seconds because we're doing regular startup without
>> having to do any crash recovery.
>>
>>
>> For actual crash recovery cases, it'll take between a few seconds and
>> checkpoint_timeout, as I described up-thread, based on the amount of
>> outstanding WAL.

Recovering up to the minumum recovery point could take minutes!
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-18 Thread Michael Paquier
On Thu, Jan 19, 2017 at 6:08 AM, Karl O. Pinc  wrote:
> On Wed, 18 Jan 2017 15:52:36 -0500
> Robert Haas  wrote:
>
>> On Wed, Jan 18, 2017 at 12:08 PM, Karl O. Pinc  wrote:
>> > Seems to me that the file format should
>> > be documented if there's any intention that the end user
>> > look at or otherwise use the file's content.
>> >
>> > It's fine with me if the content of current_logfiles
>> > is supposed to be internal to PG and not exposed
>> > to the end user.  I'm writing to make sure that
>> > this is a considered decision.
>>
>> On the whole, documenting it seems better than documenting it,
>> provided there's a logical place to include it in the existing
>> documentation.
>>
>> But, anyway, Michael shouldn't remove it without some explanation or
>> discussion.
>
> He explained that where it was looked clunky, it being
> inside a table that otherwise has rows that are not tall.
>
> And, it looks like I'm wrong.  The format is documented
> by way of an example in section 19.8.1. Where To Log
> under log_destination (string).
>
> Sorry for the bother.

Er, well. I kept the same detail verbosity in the docs...

> I would like to see index entries for "current_logfiles"
> so this stuff is findable.

Why not.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
On 2017-01-18 16:27:53 -0700, David G. Johnston wrote:
> ​I'd rather fail now and allow for the possibility of future implementation
> of the "it might seem that..." behavior.​

That's very unlikely to happen.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-01-18 Thread David Rowley
On 19 January 2017 at 11:06, David Rowley  wrote:
> Old patch no longer applies, so I've attached a rebased patch. This
> also re-adds a comment line which I mistakenly removed.

(meanwhile Andres commits 69f4b9c)

I should've waited a bit longer.

Here's another that fixes the new conflicts.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f9fb276..239f78b 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1312,6 +1312,26 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	if (es->verbose)
 		show_plan_tlist(planstate, ancestors, es);
 
+	/* unique join */
+	if (es->verbose || es->format != EXPLAIN_FORMAT_TEXT)
+	{
+		switch (nodeTag(plan))
+		{
+			case T_NestLoop:
+			case T_MergeJoin:
+			case T_HashJoin:
+			{
+const char *value = ((Join *) plan)->inner_unique ? "Yes" : "No";
+ExplainPropertyText("Inner Unique", value, es);
+value =  ((Join *) plan)->outer_unique ? "Yes" : "No";
+ExplainPropertyText("Outer Unique", value, es);
+break;
+			}
+			default:
+break;
+		}
+	}
+
 	/* quals, sort keys, etc */
 	switch (nodeTag(plan))
 	{
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index b41e4e2..5b75b64 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -306,10 +306,10 @@ ExecHashJoin(HashJoinState *node)
 	}
 
 	/*
-	 * In a semijoin, we'll consider returning the first
-	 * match, but after that we're done with this outer tuple.
+	 * Skip to the next outer tuple if we only need one inner
+	 * tuple to match.
 	 */
-	if (node->js.jointype == JOIN_SEMI)
+	if (node->js.match_first_inner_tuple_only)
 		node->hj_JoinState = HJ_NEED_NEW_OUTER;
 
 	if (otherqual == NIL ||
@@ -453,6 +453,14 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 	hjstate->js.ps.state = estate;
 
 	/*
+	 * When the planner was able to determine that the inner side of the join
+	 * will at most contain a single tuple for each outer tuple, then we can
+	 * optimize the join by skipping to the next outer tuple after we find the
+	 * first matching inner tuple.
+	 */
+	hjstate->js.match_first_inner_tuple_only = node->join.inner_unique;
+
+	/*
 	 * Miscellaneous initialization
 	 *
 	 * create expression context for node
@@ -498,8 +506,11 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 	/* set up null tuples for outer joins, if needed */
 	switch (node->join.jointype)
 	{
-		case JOIN_INNER:
 		case JOIN_SEMI:
+			/* for semi joins we match to the first inner tuple only */
+			hjstate->js.match_first_inner_tuple_only = true;
+			/* fall through */
+		case JOIN_INNER:
 			break;
 		case JOIN_LEFT:
 		case JOIN_ANTI:
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 2fd1856..378706f 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -840,10 +840,10 @@ ExecMergeJoin(MergeJoinState *node)
 	}
 
 	/*
-	 * In a semijoin, we'll consider returning the first
-	 * match, but after that we're done with this outer tuple.
+	 * Skip to the next outer tuple if we only need one inner
+	 * tuple to match.
 	 */
-	if (node->js.jointype == JOIN_SEMI)
+	if (node->js.match_first_inner_tuple_only)
 		node->mj_JoinState = EXEC_MJ_NEXTOUTER;
 
 	qualResult = (otherqual == NIL ||
@@ -1487,6 +1487,15 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 	mergestate->js.ps.state = estate;
 
 	/*
+	 * When the planner was able to determine that the inner or outer side of
+	 * the join will at most contain a single tuple for the opposing side of
+	 * the join, then we can optimize the merge join by skipping to the next
+	 * tuple since we know there are no more to match.
+	 */
+	mergestate->js.match_first_inner_tuple_only = node->join.inner_unique;
+	mergestate->js.match_first_outer_tuple_only = node->join.outer_unique;
+
+	/*
 	 * Miscellaneous initialization
 	 *
 	 * create expression context for node
@@ -1537,7 +1546,8 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 	 * only if eflags doesn't specify REWIND.
 	 */
 	if (IsA(innerPlan(node), Material) &&
-		(eflags & EXEC_FLAG_REWIND) == 0)
+		(eflags & EXEC_FLAG_REWIND) == 0 &&
+		!node->join.outer_unique)
 		mergestate->mj_ExtraMarks = true;
 	else
 		mergestate->mj_ExtraMarks = false;
@@ -1553,8 +1563,11 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 
 	switch (node->join.jointype)
 	{
-		case JOIN_INNER:
 		case JOIN_SEMI:
+			/* for semi joins we match to the first inner tuple only */
+			mergestate->js.match_first_inner_tuple_only = true;
+			/* fall through */
+		case JOIN_INNER:
 			mergestate->mj_FillOuter = false;
 			mergestate->mj_FillInner = false;
 

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread David G. Johnston
On Wed, Jan 18, 2017 at 4:14 PM, Tom Lane  wrote:

> I wrote:
> > I'll try to write something about the SRF-in-CASE issue too.  Seeing
> > whether we can document that adequately seems like an important part
> > of making the decision about whether we need to block it.
>
> Here's what I came up with:
>
>   This behavior also means that set-returning functions will be evaluated
>   even when it might appear that they should be skipped because of a
>   conditional-evaluation construct, such as CASE or COALESCE. For example,
>   consider
>
>   SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
>
>   It might seem that this should produce five repetitions of input rows
>   that have x > 0, and a single repetition of those that do not; but
>   actually it will produce five repetitions of every input row.
>
> So is this too ugly to live, or shall we put up with it?
>
>
​Disallowing such an unlikely, and un-intuitive, corner-case strikes my
sensibilities.

​I'd rather fail now and allow for the possibility of future implementation
of the "it might seem that..." behavior.​

David J.


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
On 2017-01-18 18:14:26 -0500, Tom Lane wrote:
> I wrote:
> > I'll try to write something about the SRF-in-CASE issue too.  Seeing
> > whether we can document that adequately seems like an important part
> > of making the decision about whether we need to block it.
> 
> Here's what I came up with:
> 
>   This behavior also means that set-returning functions will be evaluated
>   even when it might appear that they should be skipped because of a
>   conditional-evaluation construct, such as CASE or COALESCE. For example,
>   consider
> 
>   SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
> 
>   It might seem that this should produce five repetitions of input rows
>   that have x > 0, and a single repetition of those that do not; but
>   actually it will produce five repetitions of every input row. This is
>   because generate_series() is run first, and then the CASE expression is
>   applied to its result rows. The behavior is thus comparable to
> 
>   SELECT x, CASE WHEN x > 0 THEN g ELSE 0 END
> FROM tab, LATERAL generate_series(1,5) AS g;
> 
>   It would be exactly the same, except that in this specific example, the
>   planner could choose to put g on the outside of the nestloop join, since
>   g has no actual lateral dependency on tab. That would result in a
>   different output row order. Set-returning functions in the select list
>   are always evaluated as though they are on the inside of a nestloop join
>   with the rest of the FROM clause, so that the function(s) are run to
>   completion before the next row from the FROM clause is considered.
> 
> So is this too ugly to live, or shall we put up with it?

I'm very tentatively in favor of living with it.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
I wrote:
> I'll try to write something about the SRF-in-CASE issue too.  Seeing
> whether we can document that adequately seems like an important part
> of making the decision about whether we need to block it.

Here's what I came up with:

  This behavior also means that set-returning functions will be evaluated
  even when it might appear that they should be skipped because of a
  conditional-evaluation construct, such as CASE or COALESCE. For example,
  consider

  SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;

  It might seem that this should produce five repetitions of input rows
  that have x > 0, and a single repetition of those that do not; but
  actually it will produce five repetitions of every input row. This is
  because generate_series() is run first, and then the CASE expression is
  applied to its result rows. The behavior is thus comparable to

  SELECT x, CASE WHEN x > 0 THEN g ELSE 0 END
FROM tab, LATERAL generate_series(1,5) AS g;

  It would be exactly the same, except that in this specific example, the
  planner could choose to put g on the outside of the nestloop join, since
  g has no actual lateral dependency on tab. That would result in a
  different output row order. Set-returning functions in the select list
  are always evaluated as though they are on the inside of a nestloop join
  with the rest of the FROM clause, so that the function(s) are run to
  completion before the next row from the FROM clause is considered.

So is this too ugly to live, or shall we put up with it?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
On 2017-01-18 17:34:56 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > (I also noticed the previous patch should have had a catversion bump :(,
> > will do after the meeting).
> 
> Uh, why?  It isn't touching any on-disk data structure.

Forget what I said - I was rushing to a meeting and not thinking
entirely clearly.  Was thinking about the new node types and that we now
(de)serialize plans for parallelism - but that's guaranteed to be the
same version.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
Andres Freund  writes:
> (I also noticed the previous patch should have had a catversion bump :(,
> will do after the meeting).

Uh, why?  It isn't touching any on-disk data structure.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] smallint out of range EXECUTEing prepared statement

2017-01-18 Thread David G. Johnston
On Wed, Jan 18, 2017 at 3:15 PM, Justin Pryzby  wrote:

> Is this expected behavior ?

​​
>
> ts=# SELECT * FROM t WHERE site_id=32768 LIMIT 1;
> (0 rows)
>
> ts=# PREPARE x AS SELECT * FROM t WHERE site_id=$1 LIMIT 1;
> PREPARE
> ts=# EXECUTE x(32768);
> ERROR:  smallint out of range
>

​​Probably.  If you show the definition of "t", or at least "t.site_id",
that can be confirmed.

And, IMO, this question is more in line with the purpose of the -general
list.

David J.


[HACKERS] smallint out of range EXECUTEing prepared statement

2017-01-18 Thread Justin Pryzby
Is this expected behavior ?

ts=# SELECT * FROM t WHERE site_id=32768 LIMIT 1;
(0 rows)

ts=# PREPARE x AS SELECT * FROM t WHERE site_id=$1 LIMIT 1;
PREPARE
ts=# EXECUTE x(32768);
ERROR:  smallint out of range

ts=# PREPARE y AS SELECT * FROM t WHERE site_id::int=$1 LIMIT 1;
PREPARE
ts=# EXECUTE y(32768);
(0 rows)

Note, we also sometimes get small/int out of range when SELECTing from a view,
and we end up as a workaround putting a ::big/int cast into the view or
multiplying by 1.

Thanks,
Justin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Grouped base relation

2017-01-18 Thread David Rowley
On 19 January 2017 at 07:32, Robert Haas  wrote:
> On Tue, Jan 17, 2017 at 11:33 PM, Ashutosh Bapat
>  wrote:
>> I don't think aggcombinefn isn't there because we couldn't write it
>> for array_agg() or string_agg(). I guess, it won't be efficient to
>> have those aggregates combined across parallel workers.
>
> I think there are many cases where it would work fine.  I assume that
> David just didn't make it a priority to write those functions because
> it wasn't important to the queries he wanted to optimize.  But
> somebody can submit a patch for it any time and I bet it will have
> practical use cases.  There might be some performance problems shoving
> large numbers of lengthy values through a shm_mq, but we won't know
> that until somebody tries it.

I had assumed that the combine function which combines a large array
or a large string would not be any cheaper than doing that
incrementally with the transfn. Of course some of this would happen in
parallel, but it still doubles up some of the memcpy()ing, so perhaps
it would be slower? ... I didn't ever get a chance to test it.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
On 2017-01-18 16:56:46 -0500, Tom Lane wrote:
> Andres Freund  writes:
> I have not actually looked at 0003 at all yet.  So yeah, please post
> for review after you're done rebasing.

Here's a rebased and lightly massaged version. I'm vanishing in a
meeting for a bit, thought it'd be more useful to get it now rather than
later.

(I also noticed the previous patch should have had a catversion bump :(,
will do after the meeting).

- Andres
>From 5a0bdc9543291c051c2dbab26492f6e0320e8f82 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 18 Jan 2017 13:51:47 -0800
Subject: [PATCH] Remove obsoleted code relating to targetlist SRF evaluation.

Author: Andres Freund
Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypow...@alap3.anarazel.de
---
 src/backend/catalog/index.c   |   3 +-
 src/backend/catalog/partition.c   |   5 +-
 src/backend/commands/copy.c   |   2 +-
 src/backend/commands/prepare.c|   3 +-
 src/backend/commands/tablecmds.c  |   3 +-
 src/backend/commands/typecmds.c   |   2 +-
 src/backend/executor/execAmi.c|  44 +-
 src/backend/executor/execQual.c   | 919 --
 src/backend/executor/execScan.c   |  30 +-
 src/backend/executor/execUtils.c  |   6 -
 src/backend/executor/nodeAgg.c|  52 +-
 src/backend/executor/nodeBitmapHeapscan.c |   2 -
 src/backend/executor/nodeCtescan.c|   2 -
 src/backend/executor/nodeCustom.c |   2 -
 src/backend/executor/nodeForeignscan.c|   2 -
 src/backend/executor/nodeFunctionscan.c   |   2 -
 src/backend/executor/nodeGather.c |  25 +-
 src/backend/executor/nodeGroup.c  |  42 +-
 src/backend/executor/nodeHash.c   |   2 +-
 src/backend/executor/nodeHashjoin.c   |  58 +-
 src/backend/executor/nodeIndexonlyscan.c  |   2 -
 src/backend/executor/nodeIndexscan.c  |  11 +-
 src/backend/executor/nodeLimit.c  |  19 +-
 src/backend/executor/nodeMergejoin.c  |  59 +-
 src/backend/executor/nodeModifyTable.c|   4 +-
 src/backend/executor/nodeNestloop.c   |  41 +-
 src/backend/executor/nodeProjectSet.c |   2 +-
 src/backend/executor/nodeResult.c |  33 +-
 src/backend/executor/nodeSamplescan.c |   8 +-
 src/backend/executor/nodeSeqscan.c|   2 -
 src/backend/executor/nodeSubplan.c|  31 +-
 src/backend/executor/nodeSubqueryscan.c   |   2 -
 src/backend/executor/nodeTidscan.c|   8 +-
 src/backend/executor/nodeValuesscan.c |   5 +-
 src/backend/executor/nodeWindowAgg.c  |  58 +-
 src/backend/executor/nodeWorktablescan.c  |   2 -
 src/backend/optimizer/util/clauses.c  |   4 +-
 src/backend/optimizer/util/predtest.c |   2 +-
 src/backend/utils/adt/domains.c   |   2 +-
 src/backend/utils/adt/xml.c   |   4 +-
 src/include/executor/executor.h   |   9 +-
 src/include/nodes/execnodes.h |  16 +-
 src/pl/plpgsql/src/pl_exec.c  |   5 +-
 43 files changed, 346 insertions(+), 1189 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index cac0cbf7d4..26cbc0e06a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1805,8 +1805,7 @@ FormIndexDatum(IndexInfo *indexInfo,
 elog(ERROR, "wrong number of index expressions");
 			iDatum = ExecEvalExprSwitchContext((ExprState *) lfirst(indexpr_item),
 			   GetPerTupleExprContext(estate),
-			   &isNull,
-			   NULL);
+			   &isNull);
 			indexpr_item = lnext(indexpr_item);
 		}
 		values[i] = iDatum;
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 874e69d8d6..6dec75b59e 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -1339,7 +1339,7 @@ get_qual_for_range(PartitionKey key, PartitionBoundSpec *spec)
 			test_exprstate = ExecInitExpr(test_expr, NULL);
 			test_result = ExecEvalExprSwitchContext(test_exprstate,
 			  GetPerTupleExprContext(estate),
-	&isNull, NULL);
+	&isNull);
 			MemoryContextSwitchTo(oldcxt);
 			FreeExecutorState(estate);
 
@@ -1610,8 +1610,7 @@ FormPartitionKeyDatum(PartitionDispatch pd,
 elog(ERROR, "wrong number of partition key expressions");
 			datum = ExecEvalExprSwitchContext((ExprState *) lfirst(partexpr_item),
 			  GetPerTupleExprContext(estate),
-			  &isNull,
-			  NULL);
+			  &isNull);
 			partexpr_item = lnext(partexpr_item);
 		}
 		values[i] = datum;
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 1fd2162794..ab666b9bdd 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -3395,7 +3395,7 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 		Assert(CurrentMemoryContext == econtext->ecxt_per_tuple_memory);
 
 		values[defmap[i]] = ExecEvalExpr(defexprs[i], econtext,
-		 &nulls[defmap[i]], NULL);
+		 &null

Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-01-18 Thread David Rowley
On 3 December 2016 at 10:26, Tom Lane  wrote:
> Robert Haas  writes:
>> On Dec 2, 2016, at 7:47 AM, Haribabu Kommi  wrote:
>>> Patch still applies fine to HEAD.
>>> Moved to next CF with "ready for committer" status.
>
>> Tom, are you picking this up?
>
> Yeah, I apologize for not having gotten to it in this commitfest, but
> it's definitely something I will look at.

Old patch no longer applies, so I've attached a rebased patch. This
also re-adds a comment line which I mistakenly removed.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


unique_joins_2017-01-19.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
Andres Freund  writes:
> There's one sgml comment you'd added:
> "Furthermore, nested set-returning functions did not work at all."
> I'm not quite sure what you're referring to there - it was previously
> allowed to have one set argument to an SRF:

Ooops ... that was composed too hastily, evidently.  Will fix.

I'll try to write something about the SRF-in-CASE issue too.  Seeing
whether we can document that adequately seems like an important part
of making the decision about whether we need to block it.

> Working on rebasing the cleanup patch now.  Interested in reviewing
> that?  Otherwise I think I'll just push the rebased version of what I'd
> posted before, after making another pass through it.

I have not actually looked at 0003 at all yet.  So yeah, please post
for review after you're done rebasing.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-18 Thread Karl O. Pinc
On Wed, 18 Jan 2017 15:08:09 -0600
"Karl O. Pinc"  wrote:

> I would like to see index entries for "current_logfiles"
> so this stuff is findable.

Attached is a v27 of the patch.

I polished some of the sentences in the documentation
and added index entries.

Also attached are a series of 2 patches to apply on top
of v27 which make symbols out of hardcoded constants.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 07afa3c..427980d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4218,6 +4218,11 @@ SELECT * FROM parent WHERE key = 2400;
   where to log
  
 
+ 
+   current_logfiles
+   and the log_destination configuration parameter
+ 
+
  
 
  
@@ -4248,6 +4253,27 @@ SELECT * FROM parent WHERE key = 2400;
  must be enabled to generate
 CSV-format log output.

+   
+When either stderr or
+csvlog are included, the file
+current_logfiles is created to record the location
+of the log file(s) currently in use by the logging collector and the
+associated logging destination. This provides a convenient way to
+find the logs currently in use by the instance. Here is an example of
+this file's content:
+
+stderr pg_log/postgresql.log
+csvlog pg_log/postgresql.csv
+
+
+current_logfiles is recreated when a new log file
+is created as an effect of rotation, and
+when log_destination is reloaded.  It is removed when
+neither stderr
+nor csvlog are included
+in log_destination, and when the logging collector is
+disabled.
+   
 

 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2504a46..5b0be26 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15449,6 +15449,13 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
   
 
   
+   pg_current_logfile(text)
+   text
+   Primary log file name, or log in the requested format,
+   currently in use by the logging collector
+  
+
+  
pg_my_temp_schema()
oid
OID of session's temporary schema, or 0 if none
@@ -15667,6 +15674,45 @@ SET search_path TO schema , schema, ..

 

+pg_current_logfile
+   
+
+   
+Logging
+pg_current_logfile function
+   
+
+   
+ current_logfiles
+ and the pg_current_logfile function
+   
+
+   
+Logging
+current_logfiles file and the pg_current_logfile
+function
+   
+
+   
+pg_current_logfile returns, as text,
+the path of the log file(s) currently in use by the logging collector.
+The path includes the  directory
+and the log file name.  Log collection must be enabled or the return value
+is NULL.  When multiple log files exist, each in a
+different format, pg_current_logfile called
+without arguments returns the path of the file having the first format
+found in the ordered list: stderr, csvlog.
+NULL is returned when no log file has any of these
+formats.  To request a specific file format supply, as text,
+either csvlog or stderr as the value of the
+optional parameter. The return value is NULL when the
+log format requested is not a configured
+.  The
+pg_current_logfiles reflects the contents of the
+current_logfiles file.
+   
+
+   
 pg_my_temp_schema

 
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 5c52824..9865751 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -36,6 +36,10 @@ these required items, the cluster configuration files
 PGDATA, although it is possible to place them elsewhere.
 
 
+
+  current_logfiles
+
+
 
 Contents of PGDATA
 
@@ -61,6 +65,12 @@ Item
 
 
 
+ current_logfiles
+ File recording the log file(s) currently written to by the logging
+  collector
+
+
+
  global
  Subdirectory containing cluster-wide tables, such as
  pg_database
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 07f291b..cbeeab8 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1055,6 +1055,7 @@ REVOKE EXECUTE ON FUNCTION pg_xlog_replay_pause() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_xlog_replay_resume() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_rotate_logfile() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_reload_conf() FROM public;
+REVOKE EXECUTE ON FUNCTION pg_current_logfile() FROM public;
 
 REVOKE EXECUTE ON FUNCTION pg_stat_reset() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_stat_reset_shared(text) FROM public;
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index 13a0301..e6899c6 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -146,6 +146,7 @@ static char *logfile_getnam

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
Hi,

On 2017-01-18 15:24:32 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > Yea, have something lying around.  Let me push it then when I get back from 
> > lunch?
> 
> Sure, no sweat.

Pushed.  Yay!

There's one sgml comment you'd added:
"Furthermore, nested set-returning functions did not work at all."

I'm not quite sure what you're referring to there - it was previously
allowed to have one set argument to an SRF:

postgres[28758][1]=# SELECT generate_series(1,generate_series(1,5));
┌─┐
│ generate_series │
├─┤
│   1 │
│   1 │
│   2 │
│   1 │
│   2 │
│   3 │


Am I misunderstanding what you meant?  I left it in what I committed,
but we probably should clear up the language there.


Working on rebasing the cleanup patch now.  Interested in reviewing
that?  Otherwise I think I'll just push the rebased version of what I'd
posted before, after making another pass through it.


- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Typo in hashsearch.c

2017-01-18 Thread Robert Haas
On Fri, Jan 13, 2017 at 4:38 AM, Mithun Cy  wrote:
> There is a typo in comments of function _hash_first(); Adding a fix for same.

Thanks.  I pushed a commit including this fix and some other
improvements to that comment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] fix typo in commit a4523c5

2017-01-18 Thread Tom Lane
Nikita Glukhov  writes:
> Obviously, the last line should be
> RESET enable_indexonlyscan;

Yeah, I think you're right.  Pushed, thanks!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Stephen Frost
On Wed, Jan 18, 2017 at 16:15 Robert Haas  wrote:

> On Wed, Jan 18, 2017 at 3:59 PM, Stephen Frost  wrote:
>
> > For non-cold standby configurations, pg_ctl is going to return just as
>
> > soon as the database has finished crash recovery, which in most cases
>
> > will probably be on the order of a few seconds.
>
>
>
> /me is poleaxed.


> 


> Yeah, that was a confused sentence- most of the time it's going to return
> on the order of a few seconds because we're doing regular startup without
> having to do any crash recovery.


> For actual crash recovery cases, it'll take between a few seconds and
> checkpoint_timeout, as I described up-thread, based on the amount of
> outstanding WAL.


> Thanks!


> Stephen


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 3:59 PM, Stephen Frost  wrote:
> For non-cold standby configurations, pg_ctl is going to return just as
> soon as the database has finished crash recovery, which in most cases
> will probably be on the order of a few seconds.

/me is poleaxed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-18 Thread Karl O. Pinc
On Wed, 18 Jan 2017 15:52:36 -0500
Robert Haas  wrote:

> On Wed, Jan 18, 2017 at 12:08 PM, Karl O. Pinc  wrote:
> > Seems to me that the file format should
> > be documented if there's any intention that the end user
> > look at or otherwise use the file's content.
> >
> > It's fine with me if the content of current_logfiles
> > is supposed to be internal to PG and not exposed
> > to the end user.  I'm writing to make sure that
> > this is a considered decision.  
> 
> On the whole, documenting it seems better than documenting it,
> provided there's a logical place to include it in the existing
> documentation.
> 
> But, anyway, Michael shouldn't remove it without some explanation or
> discussion.

He explained that where it was looked clunky, it being
inside a table that otherwise has rows that are not tall.

And, it looks like I'm wrong.  The format is documented
by way of an example in section 19.8.1. Where To Log
under log_destination (string).

Sorry for the bother.

I would like to see index entries for "current_logfiles"
so this stuff is findable.

Regards.


Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Wed, Jan 18, 2017 at 3:43 PM, Stephen Frost  wrote:
> > * Robert Haas (robertmh...@gmail.com) wrote:
> >> I think we've changed the defaults to make things better for an
> >> attended startup and worse for an unattended startup.  But I think
> >> most PostgreSQL startups are probably unattended.
> >
> > I don't understand how it's worse for an unattended startup to tell the
> > init system that the database is now up and running when, in fact, it
> > isn't.
> >
> > If that isn't what you meant, then it would be really helpful if you
> > could explain a bit more what you see as being "worse" with this change
> > for unattended startup.
> 
> This seems clear as day to me, so I'm not sure what to explain.
> Anybody who has got a script that runs pg_ctl unattended mode likely
> now has to go update that script to add --no-wait.  If they don't,
> their script may hang for whatever the timeout is (currently 60
> seconds, and it sounds like Peter wants to change that to infiniity).
> If they have been wanting their script to hang all along, as you seem
> to be saying, then they'll be happy that it now does.  If, on the
> other hand, they don't want that, then they'll be sad.

Ok, you're talking about for the cold standby case here, correct?  If
so, then I agree that we should work a bit harder to make pg_ctl realize
that configuration and have it not wait for the entire timeout in that
specific situation.

That said, I, honestly, don't remember the last time I ran into a cold
standby configuration.

For non-cold standby configurations, pg_ctl is going to return just as
soon as the database has finished crash recovery, which in most cases
will probably be on the order of a few seconds.  Note that pg_ctl
retries connecting every second, it doesn't just hang and do nothing
until the timeout.

For any case where the user is actually calling pg_ctl to start the
database so that they can make connections to the database and to use it
for something, which seems much more likely, this change will mean that
they can remove the random 'sleep' they stuck in their script that they
added when they discovered that pg_ctl wasn't waiting for the database
to actually be online before returning.  Or the '--wait' they already
put in there will now be unnecessary, but that's not going to hurt
anything.  Or the loop they put in to keep trying to connect until the
database is actually up.

> In short, I bet we will get multiple reports of people getting hosed
> by this change.

I would be very surprised if we got any reports of people being hosed by
this change, but I do think we should mention it in the release notes.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 12:08 PM, Karl O. Pinc  wrote:
> Seems to me that the file format should
> be documented if there's any intention that the end user
> look at or otherwise use the file's content.
>
> It's fine with me if the content of current_logfiles
> is supposed to be internal to PG and not exposed
> to the end user.  I'm writing to make sure that
> this is a considered decision.

On the whole, documenting it seems better than documenting it,
provided there's a logical place to include it in the existing
documentation.

But, anyway, Michael shouldn't remove it without some explanation or discussion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 12:56 PM, Karl O. Pinc  wrote:
> If it were me I'd have the documentation mention
> that the pg_current_logfiles() result is
> supplied on a "best effort" basis and under
> rare conditions may be outdated.   The
> sentence in the pg_current_logfles() docs
> which reads "pg_current_logfiles reflects the contents
> of the file current_logfiles." now carries little
> meaning because the current_logfiles docs no
> longer mention that the file content may be outdated.

Generally that's going to be true of any function or SQL statement
that returns data which can change.  The only way it isn't true in
some particular case is if a function has some kind of snapshot
semantics or returns with a lock on the underlying object held.  So it
doesn't seem particularly noteworthy here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 3:43 PM, Stephen Frost  wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> On Tue, Jan 17, 2017 at 5:31 PM, Stephen Frost  wrote:
>> > If I'm understanding your concern correctly, you're worried about the
>> > case of a cold standby where the database is only replaying WAL but not
>> > configured to come up as a hot standby and therefore PQping() won't ever
>> > succeed?
>>
>> I think we've changed the defaults to make things better for an
>> attended startup and worse for an unattended startup.  But I think
>> most PostgreSQL startups are probably unattended.
>
> I don't understand how it's worse for an unattended startup to tell the
> init system that the database is now up and running when, in fact, it
> isn't.
>
> If that isn't what you meant, then it would be really helpful if you
> could explain a bit more what you see as being "worse" with this change
> for unattended startup.

This seems clear as day to me, so I'm not sure what to explain.
Anybody who has got a script that runs pg_ctl unattended mode likely
now has to go update that script to add --no-wait.  If they don't,
their script may hang for whatever the timeout is (currently 60
seconds, and it sounds like Peter wants to change that to infiniity).
If they have been wanting their script to hang all along, as you seem
to be saying, then they'll be happy that it now does.  If, on the
other hand, they don't want that, then they'll be sad.

In short, I bet we will get multiple reports of people getting hosed
by this change.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Tue, Jan 17, 2017 at 5:31 PM, Stephen Frost  wrote:
> > If I'm understanding your concern correctly, you're worried about the
> > case of a cold standby where the database is only replaying WAL but not
> > configured to come up as a hot standby and therefore PQping() won't ever
> > succeed?
> 
> I think we've changed the defaults to make things better for an
> attended startup and worse for an unattended startup.  But I think
> most PostgreSQL startups are probably unattended.

I don't understand how it's worse for an unattended startup to tell the
init system that the database is now up and running when, in fact, it
isn't.

If that isn't what you meant, then it would be really helpful if you
could explain a bit more what you see as being "worse" with this change
for unattended startup.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] postgres_fdw bug in 9.6

2017-01-18 Thread Robert Haas
On Fri, Jan 13, 2017 at 6:22 AM, Etsuro Fujita
 wrote:
> My biggest concern about GetExistingLocalJoinPath is that might not be
> extendable to the case of foreign-join paths with parameterization; in which
> case, fdw_outerpath for a given foreign-join path would need to have the
> same parameterization as the foreign-join path, but there might not be any
> existing paths with the same parameterization in the path list.

I agree that this is a problem.

> You might
> think we could get the fdw_outerpath by getting an existing path with no
> parameterization as in GetExistingLocalJoinPath and then modifying the
> path's param_info to match the parameterization of the foreign-join path.  I
> don't know that really works, but that might be inefficient.

I am not sure about this.

> What I have in mind to support foreign-join paths with parameterization for
> postgres_fdw like this: (1) generate parameterized paths from any joinable
> combination of the outer/inner cheapest-parameterized paths that have pushed
> down the outer/inner relation to the remote server, in a similar way as
> postgresGetForeignJoinPaths creates unparameterized paths, and (2) create
> fdw_outerpath for each parameterized path from the outer/inner paths used to
> generate the parameterized path, by create_nestloop_path (or,
> create_hashjoin_path or create_mergejoin_path if full join), so that the
> resulting fdw_outerpath has the same parameterization as the paramterized
> path.  This would probably work and might be more efficient.  And the patch
> I proposed would be easily extended to this, by replacing the outer/inner
> cheapest-total paths with the outer/inner cheapest-parameterized paths.
> Attached is the latest version of the patch.

Yes, I think that's broadly the approach Tom was recommending.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb_delete with arrays

2017-01-18 Thread Magnus Hagander
On Wed, Jan 18, 2017 at 5:49 AM, Michael Paquier 
wrote:

> On Tue, Jan 17, 2017 at 8:45 PM, Magnus Hagander 
> wrote:
> > On Tue, Jan 17, 2017 at 8:25 AM, Michael Paquier <
> michael.paqu...@gmail.com>
> > wrote:
> >> On Sun, Dec 18, 2016 at 1:27 AM, Dmitry Dolgov <9erthali...@gmail.com>
> >> wrote:
> >> > * use variadic arguments for `jsonb_delete_array`. For rare cases,
> when
> >> > someone decides to use this function directly instead of corresponding
> >> > operator. It will be more consistent with `jsonb_delete` from my point
> >> > of
> >> > view, because it's transition from `jsonb_delete(data, 'key')` to
> >> > `jsonb_delete(data, 'key1', 'key2')` is more smooth, than to
> >> > `jsonb_delete(data, '{key1, key2}')`.
> >>
> >> That's a good idea.
> >
> > I can see the point of that. In the particular usecase I built it for
> > originally though, the list of keys came from the application, in which
> case
> > binding them as an array was a lot more efficient (so as not to require a
> > whole lot of different prepared statements, one for each number of
> > parameters). But that should be workaround-able using the VARIADIC
> keyword
> > in the caller. Or by just using the operator.
>
> Yes that should be enough:
> =# select jsonb_delete('{"a":1 , "b":2, "c":3}', 'a', 'b', 'c');
>  jsonb_delete
> --
>  {}
> (1 row)
> =# select '{"a":1 , "b":2, "c":3}'::jsonb - '{a,b}'::text[];
>  ?column?
> --
>  {"c": 3}
> (1 row)
> That's a nice bonus, perhaps that's not worth documenting as most
> users will likely care only about the operator.
>
> >> > I've attached a patch with these modifications. What do you think?
> >>
> >> Looking at both patches proposed, documentation is still missing in
> >> the list of jsonb operators as '-' is missing for arrays. I am marking
> >> this patch as waiting on author for now.
> >
> > Added in updated patch. Do you see that as enough, or do we need it in
> some
> > more places in the docs as well?
>
> I am not seeing other places to update, thanks.
>
> Another victim of 352a24a... Your patch is failing to apply because
> now the headers of the functions is generated automatically. And the
> OIDs have been taken recently. I have fixed that to test your patch,
> the result is attached. The patch is marked as ready for committer.
>

Thanks! I had already rebased it,  so I pushed that version (picked
different oids).

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Robert Haas
On Tue, Jan 17, 2017 at 5:31 PM, Stephen Frost  wrote:
> If I'm understanding your concern correctly, you're worried about the
> case of a cold standby where the database is only replaying WAL but not
> configured to come up as a hot standby and therefore PQping() won't ever
> succeed?

I think we've changed the defaults to make things better for an
attended startup and worse for an unattended startup.  But I think
most PostgreSQL startups are probably unattended.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 3:12 PM, Robert Haas  wrote:
> On Tue, Jan 10, 2017 at 6:06 AM, Amit Langote
>  wrote:
>> [ updated patches ]
>
> I committed 0004 and also fixed the related regression test not to
> rely on DROP .. CASCADE, which isn't always stable.  The remainder of
> this patch set needs a rebase, and perhaps you could also fold in
> other pending partitioning fixes so I have everything to look at it in
> one place.

Just to be a little more clear, I don't mind multiple threads each
with a patch or patch set so much, but multiple patch sets on the same
thread gets hard for me to track.  Sorry for the inconvenience.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning vs. BulkInsertState

2017-01-18 Thread Robert Haas
On Wed, Jan 11, 2017 at 10:53 PM, Amit Langote
 wrote:
> On 2017/01/06 20:23, Amit Langote wrote:
>> On 2017/01/05 3:26, Robert Haas wrote:
>>> It's unclear to me why we need to do 0002.  It doesn't seem like it
>>> should be necessary, it doesn't seem like a good idea, and the commit
>>> message you proposed is uninformative.
>>
>> If a single BulkInsertState object is passed to
>> heap_insert()/heap_multi_insert() for different heaps corresponding to
>> different partitions (from one input tuple to next), tuples might end up
>> going into wrong heaps (like demonstrated in one of the reports [1]).  A
>> simple solution is to disable bulk-insert in case of partitioned tables.
>>
>> But my patch (or its motivations) was slightly wrongheaded, wherein I
>> conflated multi-insert stuff and bulk-insert considerations.  I revised
>> 0002 to not do that.
>
> Ragnar Ouchterlony pointed out [1] on pgsql-bugs that 0002 wasn't correct.
> Attaching updated 0002 along with rebased 0001 and 0003.

The BulkInsertState is not there only to improve performance.  It's
also there to make sure we use a BufferAccessStrategy, so that we
don't trash the whole buffer arena.  See commit
85e2cedf985bfecaf43a18ca17433070f439fb0e.  If a partitioned table uses
a separate BulkInsertState for each partition, I believe it will also
end up using a separate ring of buffers for every partition.  That may
well be faster than copying into an unpartitioned table in some cases,
because dirtying everything in the buffer arena without actually
writing any of those buffers is a lot faster than actually doing the
writes.  But it is also anti-social behavior; we have
BufferAccessStrategy objects for a reason.

One idea would be to have each partition use a separate
BulkInsertState but have them point to the same underlying
BufferAccessStrategy, but even that's problematic, because it could
result in us holding a gigantic number of pins (one per partition). I
think maybe a better idea would be to add an additional function
ReleaseBulkInsertStatePin() which gets called whenever we switch
relations, and then just use the same BulkInsertState throughout.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
Andres Freund  writes:
> Yea, have something lying around.  Let me push it then when I get back from 
> lunch?

Sure, no sweat.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Stephen Frost
* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 1/18/17 8:25 AM, Stephen Frost wrote:
> > I was actually thinking about it the other way- start out by changing
> > them to both be 5m and then document next to checkpoint_timeout (and
> > max_wal_size, perhaps...) that if you go changing those parameters (eg:
> > bumping up checkpoint_timeout to 30 minutes and max_wal_size up enough
> > that you're still checkpointing based on time and not due to running out
> > of WAL space) then you might need to consider raising the timeout for
> > pg_ctl to wait around for the server to finish going through crash
> > recovery due to all of the outstanding changes since the last
> > checkpoint.
> 
> It is important for users to be aware of this, but I don't think the
> relationship between checkpoint_timeout and recovery time is linear, so
> it's unclear what the exact advice should be.

I don't understand what I'm missing when it comes to checkpoint_timeout
and the time required to recover from a crash.  You aren't the first
person to question that association, but it seems pretty clear to me.

When doing recovery, we have to replay everything since the last
checkpoint.  If we are checkpointing at least every 5 minutes then we
can't have any more than 5 minutes worth of WAL to replay, right?

We could certainly have *less* WAL to replay after a crash, but we
shouldn't ever have more, which makes checkpoint_timeout an upper bound
on replay time.

Now, if max_wal_size is set such that you don't have enough room in the
WAL to store all of the changes and a checkpoint is forced early, then
your recovery time will be based on how fast your system can replay
max_wal_size amount of data, but even in that case, it can't be more
than checkpoint_timeout, so that still serves as an upper bound.

I think all I'm pointing out here is that we should make it clear that
checkpoint_timeout serves as an upper bound on recovery time.
Increasing it *could* lead to recovery taking longer and users should be
aware of that.  If they want to have a good handle on how long recovery
is *likely* to take on their system then they'd need to measure their
WAL rate and test their hardware to see how fast WAL is able to be
replayed.

> Personally, I think the timeout in pg_ctl is wrong and needs to be
> disabled in practical applications, but that is a different discussion.

I have to admit that the pg_ctl timeout does seem a bit odd.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Declarative partitioning - another take

2017-01-18 Thread Robert Haas
On Tue, Jan 10, 2017 at 6:06 AM, Amit Langote
 wrote:
> [ updated patches ]

I committed 0004 and also fixed the related regression test not to
rely on DROP .. CASCADE, which isn't always stable.  The remainder of
this patch set needs a rebase, and perhaps you could also fold in
other pending partitioning fixes so I have everything to look at it in
one place.

Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
Hi,

On January 18, 2017 12:00:12 PM PST, Tom Lane  wrote:
>Andres Freund  writes:
>> On 2017-01-18 14:24:12 -0500, Tom Lane wrote:
>>> So I think we can push this patch now and get on with the downstream
>>> patches.  Do you want to do the honors, or shall I?
>
>> Whatever you prefer - either way, I'll go on to rebasing the cleanup
>> patch afterwards (whose existance should probably be mentioned in the
>> commit message).
>
>OK, I can do it --- I have the revised patch already queued up in git
>stash, so it's easy.  Need to write a commit msg though.  Did you have
>a draft for that?

Yea, have something lying around.  Let me push it then when I get back from 
lunch?

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] fix typo in commit a4523c5

2017-01-18 Thread Nikita Glukhov

Attached patch fixes typo in regression test 
src/test/regress/sql/create_index.sql
that was introduced in commit a4523c5
("Improve planning of btree index scans using ScalarArrayOpExpr quals.").

In this commit the following lines were added to create_index.sql:

SET enable_indexonlyscan = OFF;
...
RESET enable_indexscan;


Obviously, the last line should be

RESET enable_indexonlyscan;

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index c6dfb26..e519fdb 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2940,7 +2940,7 @@ ORDER BY thousand;
 1 | 1001
 (2 rows)
 
-RESET enable_indexscan;
+RESET enable_indexonlyscan;
 --
 -- Check elimination of constant-NULL subexpressions
 --
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 822c34a..1648072 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1001,7 +1001,7 @@ SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)
 ORDER BY thousand;
 
-RESET enable_indexscan;
+RESET enable_indexonlyscan;
 
 --
 -- Check elimination of constant-NULL subexpressions

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Peter Eisentraut
On 1/18/17 8:25 AM, Stephen Frost wrote:
> I was actually thinking about it the other way- start out by changing
> them to both be 5m and then document next to checkpoint_timeout (and
> max_wal_size, perhaps...) that if you go changing those parameters (eg:
> bumping up checkpoint_timeout to 30 minutes and max_wal_size up enough
> that you're still checkpointing based on time and not due to running out
> of WAL space) then you might need to consider raising the timeout for
> pg_ctl to wait around for the server to finish going through crash
> recovery due to all of the outstanding changes since the last
> checkpoint.

It is important for users to be aware of this, but I don't think the
relationship between checkpoint_timeout and recovery time is linear, so
it's unclear what the exact advice should be.

Personally, I think the timeout in pg_ctl is wrong and needs to be
disabled in practical applications, but that is a different discussion.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
Andres Freund  writes:
> On 2017-01-18 14:24:12 -0500, Tom Lane wrote:
>> So I think we can push this patch now and get on with the downstream
>> patches.  Do you want to do the honors, or shall I?

> Whatever you prefer - either way, I'll go on to rebasing the cleanup
> patch afterwards (whose existance should probably be mentioned in the
> commit message).

OK, I can do it --- I have the revised patch already queued up in git
stash, so it's easy.  Need to write a commit msg though.  Did you have
a draft for that?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication existing data copy

2017-01-18 Thread Petr Jelinek
On 18/01/17 17:35, Erik Rijkers wrote:
> On 2017-01-18 14:46, Petr Jelinek wrote:
> 
>> 0001-Logical-replication-support-for-initial-data-copy-v2.patch
> 
> Applies and builds fine on top of the previous 5 patches.
> 
> Two problems:
> 
> 1.  alter_subscription.sgml has an unpaired -tag, which breaks
> the doc-build:
> This is the offending patch-line:
> +  CREATE SUBSCRIPTION with COPY
> DATA
>

Hmm, I wonder how did that compile on my machine as it's indeed syntax
error.

> 
> 2. Running the below (a version of the earlier pgbench_derail.sh) I have
> found that
>create subscription sub1  ..  with (disabled);  and then  alter
> subscription sub1 enable;
> cannot be run immediately, consecutively.  The error is avoided when the
> two
> commands are separated (for instance, below in separate psql- calls).
> 
> I don't understand why this is but it is reliably so.
>

AFAICS you should always get error from that test after you enable the
subscription, no matter if you enable immediately or later. The default
behavior is to copy the data and your test already copies them via
pg_dump/pg_restore.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
Hi,

On 2017-01-18 14:24:12 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2017-01-18 08:43:24 -0500, Tom Lane wrote:
> >> ... except for one thing.  The more I look at it,
> >> the more disturbed I am by the behavioral change shown in rangefuncs.out
> >> --- that's the SRF-in-one-arm-of-CASE issue.
> 
> > I'm fine with leaving it as is in the patch, but I'm also fine with
> > changing things to ERROR.  Personally I don't think it matters much, and
> > we can whack it back and forth as we want later.  Thus I'm inclined to
> > commit it without erroring out; since presumably we'll take some time
> > deciding on what exactly we want to prohibit.
> 
> I agree.  If we do decide to throw an error, it would best be done in
> parse analysis, and thus would be practically independent of this patch
> anyway.

Cool, agreed then.


> >> * This bit in ExecProjectSRF was no good:
> >> + else if (IsA(gstate->arg, FuncExprState) &&
> >> +  ((FuncExpr *) gstate->arg->expr)->funcretset)
> 
> > Argh. That should have been FunExprState->func->fn_retset.
> 
> Nope; that was my first thought as well, but fn_retset isn't valid if
> init_fcache hasn't been run yet, which it won't have been the first time
> through.

Righty-O :(


> So I think we can push this patch now and get on with the downstream
> patches.  Do you want to do the honors, or shall I?

Whatever you prefer - either way, I'll go on to rebasing the cleanup
patch afterwards (whose existance should probably be mentioned in the
commit message).


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning vs. information_schema

2017-01-18 Thread Robert Haas
On Tue, Jan 10, 2017 at 4:17 AM, Amit Langote
 wrote:
> On 2017/01/10 14:44, Keith Fiske wrote:
>> Is there any reason for the exclusion of parent tables from the pg_tables
>> system catalog view? They do not show up in information_schema.tables as
>> well. I believe I found where to make the changes and I tested to make sure
>> it works for my simple case. Attached is my first attempt at patching
>> anything in core. Not sure if there's anywhere else this would need to be
>> fixed.
>
> That's an oversight.  The original partitioning patch didn't touch
> information_schema.sql and system_views.sql at all.  I added the relkind =
> 'P' check in some other views as well, including what your patch considered.

I took a look at this patch:

* The SQL for pg_seclabels contained an obvious syntax error.

* pg_seclabels should only return object types that could be used in
the SECURITY LABEL command, so it needs to return 'table' not
'partitioned table' for the new relkind.

* There's a pointless change from v.relkind = 'v' to v.relkind IN ('v').

* I don't see any indication on the Internet that "PARTITIONED TABLE"
is a legal value for the table type in information_schema.tables.
Unless we can find something official, I suppose we should just
display BASE TABLE in that case as we do in other cases.  I wonder if
the schema needs some broader revision; for example, are there
information_schema elements intended to show information about
partitions?

* Even though unique/primary key/foreign key constraints can't
currently be defined on partitioned tables, it seems best to change
the related reference symmetrically with the others, because we might
support it in the future and then this would break again.  Similarly
for key_column_usage.

Committed with fixes for those issues.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
Andres Freund  writes:
> On 2017-01-18 08:43:24 -0500, Tom Lane wrote:
>> ... except for one thing.  The more I look at it,
>> the more disturbed I am by the behavioral change shown in rangefuncs.out
>> --- that's the SRF-in-one-arm-of-CASE issue.

> I'm fine with leaving it as is in the patch, but I'm also fine with
> changing things to ERROR.  Personally I don't think it matters much, and
> we can whack it back and forth as we want later.  Thus I'm inclined to
> commit it without erroring out; since presumably we'll take some time
> deciding on what exactly we want to prohibit.

I agree.  If we do decide to throw an error, it would best be done in
parse analysis, and thus would be practically independent of this patch
anyway.

>> * This bit in ExecProjectSRF was no good:
>> + else if (IsA(gstate->arg, FuncExprState) &&
>> +  ((FuncExpr *) gstate->arg->expr)->funcretset)

> Argh. That should have been FunExprState->func->fn_retset.

Nope; that was my first thought as well, but fn_retset isn't valid if
init_fcache hasn't been run yet, which it won't have been the first time
through.

So I think we can push this patch now and get on with the downstream
patches.  Do you want to do the honors, or shall I?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
Hi,

On 2017-01-18 08:43:24 -0500, Tom Lane wrote:
> I did a review pass over 0001 and 0002.  I think the attached updated
> version is committable

Cool.

> ... except for one thing.  The more I look at it,
> the more disturbed I am by the behavioral change shown in rangefuncs.out
> --- that's the SRF-in-one-arm-of-CASE issue.  (The changes in tsrf.out
> are fine and as per agreement.)  We touched lightly on that point far
> upthread, but didn't really resolve it.  What's bothering me is that
> we're changing, silently, from a reasonably-intuitive behavior to a
> completely-not-intuitive one.  Since we got a bug report for the previous
> less-than-intuitive behavior for such cases, it's inevitable that we'll
> get bug reports for this.  I think it'd be far better to throw error for
> SRF-inside-a-CASE.  If we don't, we certainly need to document this,
> and I'm not very sure how to explain it clearly.

I'm fine with leaving it as is in the patch, but I'm also fine with
changing things to ERROR.  Personally I don't think it matters much, and
we can whack it back and forth as we want later.  Thus I'm inclined to
commit it without erroring out; since presumably we'll take some time
deciding on what exactly we want to prohibit.


> Anyway, I've not done anything about that in the attached.  What I did do:
> 
> * Merge 0001 and 0002.  I appreciate you having separated that for my
> review, but it doesn't make any sense to commit the parts of 0001 that
> you undid in 0002.

Right. I was suggesting upthread that we'd merge them before committing.


> * Obviously, ExecMakeFunctionResultSet can be greatly simplified now
> that it need not deal with hasSetArg cases.

Yea, I've cleaned it up in my 0003; where it would have started to error
out too (without an explicit check), because there's no set evaluating
function anymore besides ExecMakeFunctionResultSet.


> I saw you'd left that
> for later, which is mostly fine, but I did lobotomize it just enough
> to throw an error if it gets a set result from an argument.  Without
> that, we wouldn't really be testing that the planner splits nested
> SRFs correctly.

Ok, that makes sense.


> * This bit in ExecProjectSRF was no good:
> 
> + else if (IsA(gstate->arg, FuncExprState) &&
> +  ((FuncExpr *) gstate->arg->expr)->funcretset)
> 
> because FuncExprState is used for more node types than just FuncExpr;
> in particular this would fail (except perhaps by accident) for a
> set-returning OpExpr.

Argh. That should have been FunExprState->func->fn_retset.  Anyway, your
approach works, too.


> * Update the user documentation (didn't address the CASE issue, though).

Cool.


Greetings,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Index Scans

2017-01-18 Thread Robert Haas
On Mon, Jan 16, 2017 at 7:11 AM, Amit Kapila  wrote:
> Fixed.

With respect to the second patch
(parallel_index_opt_exec_support_v4.patch), I'm hoping it can use the
new function from Dilip's bitmap heap scan patch set.  See commit
716c7d4b242f0a64ad8ac4dc48c6fed6557ba12c.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel bitmap heap scan

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 12:14 AM, Dilip Kumar  wrote:
> On Fri, Jan 13, 2017 at 6:36 PM, Dilip Kumar  wrote:
>>
>> Please verify with the new patch.
>
> Patch 0001 and 0003 required to rebase on the latest head. 0002 is
> still the same.

I've committed the first half of 0001.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Grouped base relation

2017-01-18 Thread Robert Haas
On Tue, Jan 17, 2017 at 11:33 PM, Ashutosh Bapat
 wrote:
> I don't think aggcombinefn isn't there because we couldn't write it
> for array_agg() or string_agg(). I guess, it won't be efficient to
> have those aggregates combined across parallel workers.

I think there are many cases where it would work fine.  I assume that
David just didn't make it a priority to write those functions because
it wasn't important to the queries he wanted to optimize.  But
somebody can submit a patch for it any time and I bet it will have
practical use cases.  There might be some performance problems shoving
large numbers of lengthy values through a shm_mq, but we won't know
that until somebody tries it.

> Also, the point is naming that kind of function as aggtransmultifn
> would mean that it's always supposed to multiply, which isn't true for
> all aggregates.

TransValue * integer = newTransValue is well-defined for any
aggregate.  It's the result of aggregating that TransValue with itself
a number of times defined by the integer.  And that might well be
significantly faster than using aggcombinefn many times.  On the other
hand, how many queries just sit there are re-aggregate the same
TransValues over and over again?  I am having trouble wrapping my head
around that part of this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-18 Thread Karl O. Pinc
On Wed, 18 Jan 2017 11:08:23 -0600
"Karl O. Pinc"  wrote:

> Hi Micheal,
> 
> On Wed, 18 Jan 2017 10:26:43 -0600
> "Karl O. Pinc"  wrote:
> 
> > > v26 patch attached which fixes this.
> 
> I was glancing over the changes to the documentation
> you made between the v22 and v25

If it were me I'd have the documentation mention
that the pg_current_logfiles() result is
supplied on a "best effort" basis and under
rare conditions may be outdated.   The
sentence in the pg_current_logfles() docs
which reads "pg_current_logfiles reflects the contents 
of the file current_logfiles." now carries little
meaning because the current_logfiles docs no
longer mention that the file content may be outdated.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function transform optimizations versus reality

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 12:10 PM, Tom Lane  wrote:
> More generally, this is the second serious bug we've found in the last
> month in the "transform" optimizations (see also bug #14479 and commit
> f0774abde).  I'm starting to get the feeling that that idea was an
> attractive nuisance --- at the very least, it would behoove us to go
> through all the transform functions with a gimlet eye.

So, two things for perspective:

1. Transform functions have been around for just over 5 years now; if
the two bugs we have had in the last month are the only ones, that's a
better track record than most features.  If we've had one a month for
5 years, that's terrible, of course, but I don't think that's so.

2. The value of these transform functions is principally in that table
rewrites can be avoided when executing ALTER TABLE.  And that is a big
value.  People are still looking for ways to further reduce table
rewrites, as evidenced for example by Serge Rielau's patch for
implicit defaults.

Of course, it is entirely possible that Noah and I missed some
important things when we added that feature, and an audit of the code
is very welcome.  But I would discourage you from disabling any more
of the optimization than necessary, because I think people really care
about whether ALTER TABLE has to rewrite the table or just lock it
long enough to change the metadata.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] increasing the default WAL segment size

2017-01-18 Thread Robert Haas
On Tue, Jan 17, 2017 at 8:54 PM, Michael Paquier
 wrote:
> I think that we could get a committer look at that at the least.

This is sort of awkward, because it would be nice to reuse the code
for the existing SHOW command rather than reinventing the wheel, but
it's not very easy to do that, primarily because there are a number of
places which rely on being able to do catalog access, which is not
possible with a replication connection in hand.  I got it working
after hacking various things, so I have a complete list of the
problems involved:

1. ShowGUCConfigOption() calls TupleDescInitEntry(), which does a
catcache lookup to get the types pg_type entry.  This isn't any big
problem; I hacked around it by adding a TupleDescInitBuiltinEntry()
which knows about the types that guc.c (and likely other builtins)
care about.

2. SendRowDescriptionMessage calls getBaseTypeAndTypmod(), which does
a catcache lookup to figure out whether the type is a domain.  I
short-circuited it by having it assume anything with an OID less than
1 is not a domain.

3. printtup_prepare_info calls getTypeOutputInfo(), which does a
catcache lookup to figure out the type output function's OID and
whether it's a varlena.  I bypassed that with an unspeakable hack.

4. printtup.c's code in general assumes that a DR_printtup always has
a portal.  It doesn't seem to mind if the portal doesn't contain
anything very meaningful, but it has to have one.  This problem has
nothing to do with catalog access, but it's a problem.  I solved it by
(surprise) creating a portal, but I am not sure that's a very good
idea.

Problems 2-4 actually have to do with a DestReceiver of type
DestRemote really, really wanting to have an associated Portal and
database connection, so one approach is to create a stripped-down
DestReceiver that doesn't care about those things and then passing
that to GetPGVariable.  That's not any less code than the way Beena
coded it, of course; it's probably more.  On the other hand, the
stripped-down DestReciever implementation is more likely to be usable
the next time somebody wants to add a new replication command, whereas
this ad-hoc code to directly construct protocol messages will not be
reusable.

Opinions?  (Hacked-up patch attached for educational purposes.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


grotty-replication-show-hack.patch
Description: invalid/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Function transform optimizations versus reality

2017-01-18 Thread Tom Lane
I looked into bug #14504,
https://www.postgresql.org/message-id/20170118144828.1432.52...@wrigleys.postgresql.org

The problem is that timestamp_zone_transform() has the cute idea that
it can simplify timezone('UTC', timestamptzvalue) into a RelabelType
that claims the timestamptzvalue is just type timestamp.  Well, that's
just too cute, because it confuses the index machinery completely.
What _bt_first() sees is an equality comparison between an index
column that it knows is timestamptz, and a RHS constant that it knows
is timestamp, so it selects timestamptz_cmp_timestamp() as the
comparison function to use.  And that function will apply a timezone
rotation, resulting in the wrong answers.

You could blame this on the fact that the planner will ignore the
RelabelType when trying to match the qual to indexes.  But that hack has
fifteen years' seniority on this one, and removing it would break (at
least) use of indexes on varchar columns, so I'm not planning on fixing
this that way.  I think the only realistic fix is to disable this
optimization in timestamp_zone_transform; certainly that's the only way
I'd be comfortable with back-patching.

(Another reason for being unhappy about this optimization is that EXPLAIN
will print the resulting expression tree as timestamptzvalue::timestamp,
which is a completely misleading description; an actual cast like that
would not behave this way.)

More generally, this is the second serious bug we've found in the last
month in the "transform" optimizations (see also bug #14479 and commit
f0774abde).  I'm starting to get the feeling that that idea was an
attractive nuisance --- at the very least, it would behoove us to go
through all the transform functions with a gimlet eye.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-18 Thread Karl O. Pinc
Hi Micheal,

On Wed, 18 Jan 2017 10:26:43 -0600
"Karl O. Pinc"  wrote:

> > v26 patch attached which fixes this.  

I was glancing over the changes to the documentation
you made between the v22 and v25 and from looking at the diffs 
it seems the format of the current_logfiles file content is no longer
documented.  Seems to me that the file format should
be documented if there's any intention that the end user
look at or otherwise use the file's content.

It's fine with me if the content of current_logfiles
is supposed to be internal to PG and not exposed
to the end user.  I'm writing to make sure that
this is a considered decision.

I also see that all the index entries in the docs
to the current_logfiles file were removed.  (And
I think maybe some index entries to various places
where pg_current_logfiles() was mentioned in the docs.)
I see no reason why we shouldn't have more rather
than fewer index entries in the docs.

I haven't made any sort of though review of your
changes to the docs but this jumped out at me
and I wanted to comment before the patches got
passed on to the committers.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication existing data copy

2017-01-18 Thread Erik Rijkers

On 2017-01-18 14:46, Petr Jelinek wrote:


0001-Logical-replication-support-for-initial-data-copy-v2.patch


Applies and builds fine on top of the previous 5 patches.

Two problems:

1.  alter_subscription.sgml has an unpaired -tag, which breaks 
the doc-build:

This is the offending patch-line:
+  CREATE SUBSCRIPTION with COPY 
DATA



2. Running the below (a version of the earlier pgbench_derail.sh) I have 
found that
   create subscription sub1  ..  with (disabled);  and then  alter 
subscription sub1 enable;
cannot be run immediately, consecutively.  The error is avoided when the 
two

commands are separated (for instance, below in separate psql- calls).

I don't understand why this is but it is reliably so.

The error(s):
2017-01-18 17:26:56.126 CET 24410 LOG:  starting logical replication 
worker for subscription "sub1"
2017-01-18 17:26:56.132 CET 26291 LOG:  logical replication apply for 
subscription sub1 started
2017-01-18 17:26:56.139 CET 26291 LOG:  starting logical replication 
worker for subscription "sub1"
2017-01-18 17:26:56.145 CET 26295 LOG:  logical replication sync for 
subscription sub1, table pgbench_accounts started
2017-01-18 17:26:56.534 CET 26295 ERROR:  duplicate key value violates 
unique constraint "pgbench_accounts_pkey"

2017-01-18 17:26:56.534 CET 26295 DETAIL:  Key (aid)=(1) already exists.
2017-01-18 17:26:56.534 CET 26295 CONTEXT:  COPY pgbench_accounts, line 
1
2017-01-18 17:26:56.536 CET 21006 LOG:  worker process: logical 
replication worker 41015 sync 40991 (PID 26295) exited with exit code 1
2017-01-18 17:26:56.536 CET 26291 LOG:  starting logical replication 
worker for subscription "sub1"
2017-01-18 17:26:56.542 CET 26297 LOG:  logical replication sync for 
subscription sub1, table pgbench_branches started
2017-01-18 17:26:57.015 CET 26297 ERROR:  duplicate key value violates 
unique constraint "pgbench_branches_pkey"

2017-01-18 17:26:57.015 CET 26297 DETAIL:  Key (bid)=(1) already exists.
2017-01-18 17:26:57.015 CET 26297 CONTEXT:  COPY pgbench_branches, line 
1
2017-01-18 17:26:57.017 CET 21006 LOG:  worker process: logical 
replication worker 41015 sync 40994 (PID 26297) exited with exit code 1
2017-01-18 17:26:57.017 CET 26291 LOG:  starting logical replication 
worker for subscription "sub1"
2017-01-18 17:26:57.023 CET 26299 LOG:  logical replication sync for 
subscription sub1, table pgbench_history started
2017-01-18 17:26:57.487 CET 26299 LOG:  logical replication 
synchronization worker finished processing
2017-01-18 17:26:57.488 CET 26291 LOG:  starting logical replication 
worker for subscription "sub1"
2017-01-18 17:26:57.491 CET 26301 LOG:  logical replication sync for 
subscription sub1, table pgbench_tellers started
2017-01-18 17:26:57.948 CET 26301 ERROR:  duplicate key value violates 
unique constraint "pgbench_tellers_pkey"

2017-01-18 17:26:57.948 CET 26301 DETAIL:  Key (tid)=(1) already exists.
2017-01-18 17:26:57.948 CET 26301 CONTEXT:  COPY pgbench_tellers, line 1
etc, etc.









#!/bin/sh

# assumes both instances are running, initially without publication or 
subscription


unset PGSERVICEFILE PGSERVICE PGPORT PGDATA PGHOST
env | grep PG

PGDATABASE=testdb

# clear logs
echo > 
/home/aardvark/pg_stuff/pg_installations/pgsql.logical_replication/logfile.logical_replication
echo > 
/home/aardvark/pg_stuff/pg_installations/pgsql.logical_replication2/logfile.logical_replication2


port1=6972
port2=6973

function cb()
{
  #  display the 4 pgbench tables' accumulated content as md5s
  #  a,b,t,h stand for:  pgbench_accounts, -branches, -tellers, -history
  for port in $port1 $port2
  do
md5_a=$(echo "select * from pgbench_accounts order by aid"|psql 
-qtAXp$port|md5sum|cut -b 1-9)
md5_b=$(echo "select * from pgbench_branches order by bid"|psql 
-qtAXp$port|md5sum|cut -b 1-9)
md5_t=$(echo "select * from pgbench_tellers  order by tid"|psql 
-qtAXp$port|md5sum|cut -b 1-9)
md5_h=$(echo "select * from pgbench_history  order by hid"|psql 
-qtAXp$port|md5sum|cut -b 1-9)
cnt_a=$(echo "select count(*) from pgbench_accounts"|psql -qtAXp 
$port)
cnt_b=$(echo "select count(*) from pgbench_branches"|psql -qtAXp 
$port)
cnt_t=$(echo "select count(*) from pgbench_tellers" |psql -qtAXp 
$port)
cnt_h=$(echo "select count(*) from pgbench_history" |psql -qtAXp 
$port)
printf "$port a,b,t,h: %6d %6d %6d %6d" $cnt_a  $cnt_b  $cnt_t  
$cnt_h

echo -n "   $md5_a  $md5_b  $md5_t  $md5_h"
if   [[ $port -eq $port1 ]]; then echo "   master"
elif [[ $port -eq $port2 ]]; then echo "   replica"
else  echo " ERROR"
fi
  done
}


echo "
drop table if exists pgbench_accounts;
drop table if exists pgbench_branches;
drop table if exists pgbench_tellers;
drop table if exists pgbench_history;" | psql -X -p $port1 \
  && echo "
drop table if exists pgbench_accounts;
drop table if exists pgbench_branches;
drop table if exists pgbench_tellers;
drop table if exists pgbench_history;" | p

Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-18 Thread Karl O. Pinc
On Wed, 18 Jan 2017 10:11:20 -0600
"Karl O. Pinc"  wrote:

> You must write
>   errcode(ERRCODE_INTERNAL_ERROR)
> instead of just
>   ERRCODE_INTERNAL_ERROR
> 
> If you don't you get back 01000 for the error code.

> v26 patch attached which fixes this.

Attached are revised versions of the 2 (optional) patches which
make hardcoded constants into symbols to apply on
top of the v26 patch.


Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein


patch_pg_current_logfile-v26.diff.abstract_guc_part1
Description: Binary data


patch_pg_current_logfile-v26.diff.abstract_guc_part2
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSQL commands: \quit_if, \quit_unless

2017-01-18 Thread Corey Huinker
On Wed, Jan 18, 2017 at 12:08 AM, Michael Paquier  wrote:

> On Wed, Jan 18, 2017 at 3:24 AM, Robert Haas 
> wrote:
> > On Sat, Jan 14, 2017 at 12:22 AM, Tom Lane  wrote:
> >>
> >> $ cat loop.sql
> >> \if :x < 1000
> >>   \echo :x
> >>   \set x :x + 1
> >>   \include loop.sql
> >> \fi
> >> $ psql --set x=0 -f loop.sql
> >>
> >> Somebody is going to think of that workaround for not having loops, and
> >> then whine about how psql runs out of file descriptors and/or stack.
> >
> > Hmm, I think somebody just DID think of it.
> >
> > But personally this doesn't upset me a bit.  If somebody complains
> > about that particular thing, I think that would be an excellent time
> > to suggest that they write a patch to add a looping construct.
>
> Agreed.
>
> As far as I can see on this thread, something could be done, it is
> just that we don't know yet at which extent things could be done with
> the first shot. There are many things that could be done, but at least
> I'd suggest to get \if, \fi and \quit to satisfy the first
> requirements of this thread, and let loops out of it. I have switched
> the patch as "returned with feedback" as getting a new patch is going
> to require some thoughts to get the context handling done correctly on
> psql side.
> --
> Michael
>

Fabien is pressed for time, so I've been speaking with him out-of-thread
about how I should go about implementing it.

The v1 patch will be \if , \elseif , \else, \endif, where 
will be naively evaluated via ParseVariableBool().

\ifs and \endifs must be in the same "file" (each MainLoop will start a new
if-stack). This is partly for sanity (you can see the pairings unless the
programmer is off in \gset meta-land), partly for ease  of design (data
structures live in MainLoop), but mostly because it would an absolute
requirement if we ever got around to doing \while.

I hope to have something ready for the next commitfest.

As for the fate of \quit_if, I can see it both ways. On the one hand, it's
super-simple, already written, and handy.

On the other hand, it's easily replaced by

\if 
\q
\endif


So I'll leave that as a separate reviewable patch.

As for loops, I don't think anyone was pushing for implementing \while now,
only to have a decision about what it would look like and how it would
work. There's a whole lot of recording infrastructure (the input could be a
stream) needed to make it happen. Moreover, I think \gexec scratched a lot
of the itches that would have been solved via a psql looping structure.


Re: [HACKERS] move collation import to backend

2017-01-18 Thread Tom Lane
Jeff Janes  writes:
> With this commit, I'm getting 'make check' fail at initdb with the error:

> 2017-01-18 07:47:50.565 PST [43691] FATAL:  collation "aa_ER@saaho" for
> encoding "UTF8" already exists

Yeah, so are large chunks of the buildfarm.  Having now read the patch,
I see that the problem is that it simply ignored the de-duplication
logic that existed in initdb's implementation.  That was put there
on the basis of bitter experience, as I recall.

The new code seems to think it's sufficient to do an "if not exists"
insertion when generating abbreviated names, but that's wrong, and
even if it avoided outright failures, it would be nondeterministic
(I doubt "locale -a" is guaranteed to emit locale names in any
particular order).

I think this needs to be reverted pending redesign of the de-duplication
coding.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-18 Thread Karl O. Pinc
On Wed, 18 Jan 2017 13:26:46 +0900
Michael Paquier  wrote:

> On Wed, Jan 18, 2017 at 11:36 AM, Karl O. Pinc  wrote:
> > On Wed, 18 Jan 2017 11:08:25 +0900
> > Michael Paquier  wrote:
> >  
> >> Our ideas rather map here, ERRCODE_INTERNAL_ERROR would be adapted
> >> for this situation. Do any of you want to give it a shot or should
> >> I?  

> What do you think about that?
> +   log_filepath = strchr(lbuffer, ' ');
> +   if (log_filepath == NULL)
> +   {
> +   /*
> +* Corrupted data found, return NULL to the caller and
> still
> +* inform him on the situation.
> +*/
> +   ereport(WARNING,
> +   (ERRCODE_INTERNAL_ERROR,
> +errmsg("corrupted data found in \"%s\"",
> +   LOG_METAINFO_DATAFILE)));
> +   break;
> +   }

You must write

  errcode(ERRCODE_INTERNAL_ERROR)

instead of just
  
  ERRCODE_INTERNAL_ERROR

If you don't you get back 01000 for the error code.
Test by using psql with '\set VERBOSITY verbose'.

v26 patch attached which fixes this.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 07afa3c..3188496 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4248,6 +4248,25 @@ SELECT * FROM parent WHERE key = 2400;
  must be enabled to generate
 CSV-format log output.

+   
+When either stderr or
+csvlog are included, the file
+current_logfiles gets created and records the location
+of the log file(s) currently in use by the logging collector and the
+associated logging destination. This provides a convenient way to
+find the logs currently in use by the instance. Here is an example of
+contents of this file:
+
+stderr pg_log/postgresql.log
+csvlog pg_log/postgresql.csv
+
+Note that this file gets updated when a new log file is created
+as an effect of rotation or when log_destination is
+reloaded.  current_logfiles is removed when
+stderr and csvlog
+are not included in log_destination or when the
+logging collector is disabled.
+   
 

 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index eb1b698..8524dff 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15441,6 +15441,13 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
   
 
   
+   pg_current_logfile(text)
+   text
+   Primary log file name, or log in the requested format,
+   currently in use by the logging collector
+  
+
+  
pg_my_temp_schema()
oid
OID of session's temporary schema, or 0 if none
@@ -15659,6 +15666,34 @@ SET search_path TO schema , schema, ..

 

+pg_current_logfile
+   
+
+   
+Logging
+pg_current_logfile function
+   
+
+   
+pg_current_logfile returns, as text,
+the path of the log file(s) currently in use by the logging collector.
+The path includes the  directory
+and the log file name.  Log collection must be enabled or the return value
+is NULL.  When multiple log files exist, each in a
+different format, pg_current_logfile called
+without arguments returns the path of the file having the first format
+found in the ordered list: stderr, csvlog.
+NULL is returned when no log file has any of these
+formats.  To request a specific file format supply, as text,
+either csvlog or stderr as the value of the
+optional parameter. The return value is NULL when the
+log format requested is not a configured
+.
+pg_current_logfiles reflects the contents of the
+file current_logfiles.
+   
+
+   
 pg_my_temp_schema

 
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 5c52824..388ed34 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -61,6 +61,12 @@ Item
 
 
 
+ current_logfiles
+ File recording the log file(s) currently written to by the logging
+  collector
+
+
+
  global
  Subdirectory containing cluster-wide tables, such as
  pg_database
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 31aade1..1d7f68b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1055,6 +1055,7 @@ REVOKE EXECUTE ON FUNCTION pg_xlog_replay_pause() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_xlog_replay_resume() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_rotate_logfile() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_reload_conf() FROM public;
+REVOKE EXECUTE ON FUNCTION pg_current_logfile() FROM public;
 
 REVOKE EXECUTE ON FUNCTION pg_stat_reset() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_stat_reset_shared(text) FROM public;
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/sysl

Re: [HACKERS] Parallel Index Scans

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 8:03 AM, Amit Kapila  wrote:
> On Tue, Jan 17, 2017 at 11:27 PM, Robert Haas  wrote:
>> On Mon, Jan 16, 2017 at 7:11 AM, Amit Kapila  wrote:
>> WAIT_EVENT_PARALLEL_INDEX_SCAN is in fact btree-specific.  There's no
>> guarantee that any other AMs the implement parallel index scans will
>> use that wait event, and they might have others instead.  I would make
>> it a lot more specific, like WAIT_EVENT_BTREE_PAGENUMBER.  (Waiting
>> for the page number needed to continue a parallel btree scan to become
>> available.)
>
> WAIT_EVENT_BTREE_PAGENUMBER - NUMBER sounds slightly inconvenient. How
> about just WAIT_EVENT_BTREE_PAGE?  We can keep the description as
> suggested by you?

Sure.

>> Why do we need separate AM support for index_parallelrescan()?  Why
>> can't index_rescan() cover that case?
>
> The reason is that sometime index_rescan is called when we have to
> just update runtime scankeys in index and we don't want to reset
> parallel scan for that.  Refer ExecReScanIndexScan() changes in patch
> parallel_index_opt_exec_support_v4.  Rescan is called from below place
> during index scan.

Hmm, tricky.  OK, I'll think about that some more.

>> I think it's a bad idea to add a ParallelIndexScanDesc argument to
>> index_beginscan().  That function is used in lots of places, and
>> somebody might think that they are allowed to actually pass a non-NULL
>> value there, which they aren't: they must go through
>> index_beginscan_parallel.  I think that the additional argument should
>> only be added to index_beginscan_internal, and
>> index_beginscan_parallel should remain unchanged.
>
> If we go that way then we need to set few parameters like heapRelation
> and xs_snapshot in index_beginscan_parallel as we are doing in
> index_beginscan. Does going that way sound better to you?

It's pretty minor code duplication; I don't think it's an issue.

> I think there is value in retaining index_beginscan_parallel as that
> is parallel to heap_beginscan_parallel.

OK.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] move collation import to backend

2017-01-18 Thread Jeff Janes
On Tue, Jan 17, 2017 at 9:05 AM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 1/9/17 10:04 PM, Euler Taveira wrote:
> > On 18-12-2016 18:30, Peter Eisentraut wrote:
> >> Updated patch with that fix.
> >>
> > Peter, I reviewed and improved your patch.
> >
> > * I document the new function. Since collation is a database object, I
> > chose "Database Object Management Functions" section.
>
> OK
>
> > * I've added a check to any-encoding database because I got 'FATAL:
> > collation "C" already exists' on Debian 8, although, I didn't get on
> > CentOS 7. The problem is that Debian has two locales for C (C and
> > C.UTF-8) and CentOS has just one (C).
>
> OK
>
> > * I've added OidIsValid to test the new collation row.
>
> OK
>
> > * I've changed the parameter order. Schema seems more important than
> > if_not_exists. Also, we generally leave those boolean parameters for the
> > end of list. I don't turn if_not_exists optional but IMO it would be a
> > good idea (default = true).
>
> I put them that way because in an SQL command the "IF NOT EXISTS" comes
> before the schema, but I can see how it is weird that way in a function.
>
> > * You removed some #if and #ifdef while moving things around. I put it
> back.
> > * You didn't pgident some lines of code but I'm sure you didn't for a
> > small patch footprint.
>
> I had left the #if in initdb, but I think your changes are better.
>
> > I'm attaching the complete and also a patch at the top of your last
> patch.
>
> Thanks.  If there are no more comments, I will proceed with that.
>
>
With this commit, I'm getting 'make check' fail at initdb with the error:

2017-01-18 07:47:50.565 PST [43691] FATAL:  collation "aa_ER@saaho" for
encoding "UTF8" already exists
2017-01-18 07:47:50.565 PST [43691] STATEMENT:  SELECT
pg_import_system_collations(if_not_exists => false, schema => 'pg_catalog');

My system:

CentOS release 6.8 (Final)
gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC)

./configure > /dev/null # no options

$ locale -a|fgrep aa_ER
aa_ER
aa_ER.utf8
aa_ER.utf8@saaho
aa_ER@saaho

I have no idea what @ means in a locale, I'm just relaying the information.

Cheers,

Jeff


Re: [HACKERS] pageinspect: Hash index support

2017-01-18 Thread Jesper Pedersen

Hi,

On 01/18/2017 04:54 AM, Ashutosh Sharma wrote:

Is there a reason for keeping the input arguments for
hash_bitmap_info() different from hash_page_items()?



Yes, there are two reasons behind it.

Firstly, we need metapage to identify the bitmap page that holds the
information about the overflow page passed as an input to this
function.

Secondly, we will have to input overflow block number as an input to
this function so as to determine the overflow bit number which can be
used further to identify the bitmap page.

+   /* Read the metapage so we can determine which bitmap page to use */
+   metabuf = _hash_getbuf(indexRel, HASH_METAPAGE, HASH_READ, LH_META_PAGE);
+   metap = HashPageGetMeta(BufferGetPage(metabuf));
+
+   /* Identify overflow bit number */
+   ovflbitno = _hash_ovflblkno_to_bitno(metap, ovflblkno);
+
+   bitmappage = ovflbitno >> BMPG_SHIFT(metap);
+   bitmapbit = ovflbitno & BMPG_MASK(metap);
+
+   if (bitmappage >= metap->hashm_nmaps)
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("invalid overflow bit number %u", ovflbitno)));
+
+   bitmapblkno = metap->hashm_mapp[bitmappage];




As discussed off-list (along with my patch that included some of these 
fixes), it would require calculation from the user to be able to provide 
the necessary pages through get_raw_page().


Other ideas on how to improve this are most welcome.


Apart from above comments, the attached patch also handles all the
comments from Mithun.



Please, include a version number for your patch files in the future.

Fixed in this version:

* verify_hash_page: Display magic in hex, like hash_metapage_info
* Update header for hash_page_type

Moving the patch back to 'Needs Review'.

Best regards,
 Jesper

>From 8e5a68cfaf979bcab88fb9358b88a89bc780a277 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Wed, 18 Jan 2017 08:42:02 -0500
Subject: [PATCH] Add support for hash index in pageinspect contrib module v15

Authors: Ashutosh Sharma and Jesper Pedersen.
---
 contrib/pageinspect/Makefile  |  10 +-
 contrib/pageinspect/hashfuncs.c   | 574 ++
 contrib/pageinspect/pageinspect--1.5--1.6.sql |  76 
 contrib/pageinspect/pageinspect.control   |   2 +-
 doc/src/sgml/pageinspect.sgml | 148 +++
 src/backend/access/hash/hashovfl.c|   8 +-
 src/include/access/hash.h |   1 +
 7 files changed, 810 insertions(+), 9 deletions(-)
 create mode 100644 contrib/pageinspect/hashfuncs.c
 create mode 100644 contrib/pageinspect/pageinspect--1.5--1.6.sql

diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index 87a28e9..052a8e1 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -2,13 +2,13 @@
 
 MODULE_big	= pageinspect
 OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
-		  brinfuncs.o ginfuncs.o $(WIN32RES)
+		   brinfuncs.o ginfuncs.o hashfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.5.sql pageinspect--1.4--1.5.sql \
-	pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
-	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
-	pageinspect--unpackaged--1.0.sql
+DATA = pageinspect--1.5.sql pageinspect--1.5--1.6.sql \
+	pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \
+	pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
+	pageinspect--1.0--1.1.sql pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
 
 REGRESS = page btree brin gin
diff --git a/contrib/pageinspect/hashfuncs.c b/contrib/pageinspect/hashfuncs.c
new file mode 100644
index 000..f157fcc
--- /dev/null
+++ b/contrib/pageinspect/hashfuncs.c
@@ -0,0 +1,574 @@
+/*
+ * hashfuncs.c
+ *		Functions to investigate the content of HASH indexes
+ *
+ * Copyright (c) 2017, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		contrib/pageinspect/hashfuncs.c
+ */
+
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "access/htup_details.h"
+#include "catalog/pg_am.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "utils/builtins.h"
+
+PG_FUNCTION_INFO_V1(hash_page_type);
+PG_FUNCTION_INFO_V1(hash_page_stats);
+PG_FUNCTION_INFO_V1(hash_page_items);
+PG_FUNCTION_INFO_V1(hash_bitmap_info);
+PG_FUNCTION_INFO_V1(hash_metapage_info);
+
+#define IS_HASH(r) ((r)->rd_rel->relam == HASH_AM_OID)
+
+/* 
+ * structure for single hash page statistics
+ * 
+ */
+typedef struct HashPageStat
+{
+	uint32		live_items;
+	uint32		dead_items;
+	uint32		page_size;
+	uint32		free_size;
+
+	/* opaque data */
+	BlockNumber hasho_prevblkno;
+	BlockNumber hasho_nextblkno;
+	Bucket		hasho_bucket;
+	uint16		hasho_flag;
+	uint16		hasho_page_id;
+} HashPageStat;
+
+
+/*
+ * Verify that the given bytea contains a HASH page, or die in the attempt.
+ * A pointer to the page is returned.

Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-18 Thread Vladimir Rusinov
On Wed, Jan 18, 2017 at 12:28 PM, Michael Paquier  wrote:

> On Wed, Jan 18, 2017 at 8:15 PM, Vladimir Rusinov 
> wrote:
> > On the topic of binaries, there's going to be another patch renaming
> them.
> > Those will have no aliases as it's trivial to work-around (symlinks,
> shell
> > scripts, etc) and not so trivial to implement in a portable way.
>
> On Windows that would be a pain... So let's not use symlinks for binaries.


Just to be clear, I'm not proposing that we use symlinks. Just pointing out
that workarounds are trivial for the end users.

-- 
Vladimir Rusinov
Storage SRE, Google Ireland

Google Ireland Ltd.,Gordon House, Barrow Street, Dublin 4, Ireland
Registered in Dublin, Ireland
Registration Number: 368047


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [HACKERS] References to arbitrary database objects that are suitable for pg_dump

2017-01-18 Thread Alvaro Herrera
Jim Nasby wrote:

> Is there anything that will translate an object identity (as returned by
> pg_event_trigger_ddl_commands) into class/object/subobject OIDs? If I had
> that, I could use it to track objects by OID (like pg_depend does), as well
> as storing the text representation. That way I can have an event trigger
> that updates the stored pg_describe_object() text any time an object was
> renamed. Without that, I don't see any good way to find the original name of
> a renamed object (so that I can update my reference to it).

> BTW, why were pg_get_object_address and pg_identify_object_as_address added
> instead of creating a function that accepted the output of
> pg_identify_object()? Is there a reason that wouldn't work?

I suppose it's because your use case is different than the one I was
trying to satisfy, so we have different ideas on what is necessary.

> ISTM it would be useful to add address_names and address_args to
> pg_event_trigger_ddl_commands(), as well as adding some way to get the
> original information for an object when an ALTER is being done. In
> particular, if something (such as a column) is being renamed.

Yeah, this info is available in the pg_ddl_command object, but you need
to write some C code to get to it, as the structs are specific to each
possible command.  I suppose we could have written another function such
as the one with have for drops (pg_event_trigger_dropped_objects).  As
far as I can tell we don't have any commands that rename more than one
object at a time, so it would be simpler, but I don't know how likely
that is to change.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2017-01-18 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Thanks for the review.  Attached is an updated patch that I believe
>> addresses all of the review comments so far.  The code is unchanged from
>> v2, but I improved the README, some comments, and the regression tests.

> I've reviewed your updates and they answer all of my comments and I
> appreciate the EC regression tests you added.

> I also agree with Dean's down-thread suggested regression test change.

Thanks for reviewing --- I'll do something with that test case and
push it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] emergency outage requiring database restart

2017-01-18 Thread Merlin Moncure
On Wed, Jan 18, 2017 at 4:11 AM, Ants Aasma  wrote:
> On Wed, Jan 4, 2017 at 5:36 PM, Merlin Moncure  wrote:
>> Still getting checksum failures.   Over the last 30 days, I see the
>> following.  Since enabling checksums FWICT none of the damage is
>> permanent and rolls back with the transaction.   So creepy!
>
> The checksums still only differ in least significant digits which
> pretty much means that there is a block number mismatch. So if you
> rule out filesystem not doing its job correctly and transposing
> blocks, it could be something else that is resulting in blocks getting
> read from a location that happens to differ by a small multiple of
> page size. Maybe somebody is racily mucking with table fd's between
> seeking and reading. That would explain the issue disappearing after a
> retry.
>
> Maybe you can arrange for the RelFileNode and block number to be
> logged for the checksum failures and check what the actual checksums
> are in data files surrounding the failed page. If the requested block
> number contains something completely else, but the page that follows
> contains the expected checksum value, then it would support this
> theory.

will do.   Main challenge is getting hand compiled server to swap in
so that libdir continues to work.  Getting access to the server is
difficult as is getting a maintenance window.  I'll post back ASAP.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Index Scans

2017-01-18 Thread Amit Kapila
On Wed, Jan 18, 2017 at 6:25 AM, Haribabu Kommi
 wrote:
>
>
> On Mon, Jan 16, 2017 at 11:11 PM, Amit Kapila 
> wrote:
>>
>
> + * index_beginscan_parallel - join parallel index scan
>
> The name and the description doesn't sync properly, any better description?
>

This can be called by both the worker and leader of parallel index
scan.  What problem do you see with it.  heap_beginscan_parallel has
similar description, so not sure changing here alone makes sense.


>
> +extern BlockNumber _bt_parallel_seize(IndexScanDesc scan, bool *status);
> +extern void _bt_parallel_release(IndexScanDesc scan, BlockNumber
> scan_page);
>
> Any better names for the above functions, as these function will provide/set
> the next page that needs to be read.
>

These functions also set the state of scan.  IIRC, these names were
being agreed between Robert and Rahila as well (suggested offlist by
Robert).  I am open to change if you or others have any better
suggestions.

>
> + /* reset (parallel) index scan */
> + if (node->iss_ScanDesc)
> + {
>
> Why this if check required? There is an assert check in later function
> calls.
>

This is required because we don't initialize the scan descriptor for
parallel-aware nodes during ExecInitIndexScan.  It got initialized
later at the time of execution when we initialize dsm.  Now, it is
quite possible that Gather node can occur on inner side of join in
which case Rescan can be called before even execution starts. This is
the reason why we have similar check in ExecReScanSeqScan which is
added during parallel sequential scans (f0661c4e). Does that answer
your question?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
I did a review pass over 0001 and 0002.  I think the attached updated
version is committable ... except for one thing.  The more I look at it,
the more disturbed I am by the behavioral change shown in rangefuncs.out
--- that's the SRF-in-one-arm-of-CASE issue.  (The changes in tsrf.out
are fine and as per agreement.)  We touched lightly on that point far
upthread, but didn't really resolve it.  What's bothering me is that
we're changing, silently, from a reasonably-intuitive behavior to a
completely-not-intuitive one.  Since we got a bug report for the previous
less-than-intuitive behavior for such cases, it's inevitable that we'll
get bug reports for this.  I think it'd be far better to throw error for
SRF-inside-a-CASE.  If we don't, we certainly need to document this,
and I'm not very sure how to explain it clearly.

Upthread we had put COALESCE in the same bucket, but I think that's less
of a problem, because in typical usages the SRF would be in the first
argument and so users wouldn't be expecting conditional evaluation.

Anyway, I've not done anything about that in the attached.  What I did do:

* Merge 0001 and 0002.  I appreciate you having separated that for my
review, but it doesn't make any sense to commit the parts of 0001 that
you undid in 0002.

* Rename the node types as per yesterday's discussion.

* Require Project to always have an input plan node.

* Obviously, ExecMakeFunctionResultSet can be greatly simplified now
that it need not deal with hasSetArg cases.  I saw you'd left that
for later, which is mostly fine, but I did lobotomize it just enough
to throw an error if it gets a set result from an argument.  Without
that, we wouldn't really be testing that the planner splits nested
SRFs correctly.

* This bit in ExecProjectSRF was no good:

+ else if (IsA(gstate->arg, FuncExprState) &&
+  ((FuncExpr *) gstate->arg->expr)->funcretset)

because FuncExprState is used for more node types than just FuncExpr;
in particular this would fail (except perhaps by accident) for a
set-returning OpExpr.  I chose to fix it by adding a funcReturnsSet
field to FuncExprState and insisting that ExecInitExpr fill that in
immediately, which it can do easily.

* Minor style and comment improvements; fix a couple small oversights
such as missing outfuncs.c support.

* Update the user documentation (didn't address the CASE issue, though).

regards, tom lane



use-project-set-for-tlist-srfs.patch.gz
Description: use-project-set-for-tlist-srfs.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-18 Thread Stephen Frost
Michael,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Wed, Jan 18, 2017 at 10:35 AM, Michael Paquier
>  wrote:
> > On Wed, Jan 18, 2017 at 7:31 AM, Stephen Frost  wrote:
> >> Perhaps we need a way for pg_ctl to realize a cold-standby case and
> >> throw an error or warning if --wait is specified then, but that hardly
> >> seems like the common use-case.  It also wouldn't make any sense to have
> >> anything in the init system which depended on PG being up in such a case
> >> because, well, PG isn't ever going to be 'up'.
> >
> > Yeah, it seems to me that we are likely looking for a wait mode saying
> > to exit pg_ctl once Postgres is happily rejecting connections, because
> > that means that it is up and that it is sorting out something first
> > before accepting them. This would basically filter the states in the
> > control file that we find as acceptable if the connection test
> > continues complaining about PQPING_REJECT.
> 
> Another option would be as well to log the state of the control file
> to the user to let him know what currently happens, and document that
> increasing the wait timeout is recommended if the recovery time since
> the last redo point takes longer.

I was actually thinking about it the other way- start out by changing
them to both be 5m and then document next to checkpoint_timeout (and
max_wal_size, perhaps...) that if you go changing those parameters (eg:
bumping up checkpoint_timeout to 30 minutes and max_wal_size up enough
that you're still checkpointing based on time and not due to running out
of WAL space) then you might need to consider raising the timeout for
pg_ctl to wait around for the server to finish going through crash
recovery due to all of the outstanding changes since the last
checkpoint.

In particular, I like to think that will help people understand the
downsides of raising those values; I've run into a number of cases where
people seem to feel it's a win-win without any downside, but that isn't
really the case.

Thanks!

Stephen


signature.asc
Description: Digital signature


  1   2   >