Re: [HACKERS] Creating backup history files for backups taken from standbys

2017-09-18 Thread Masahiko Sawada
On Tue, Sep 19, 2017 at 8:33 AM, David Steele  wrote:
> On 9/18/17 7:26 PM, Michael Paquier wrote:
>> On Tue, Sep 19, 2017 at 8:14 AM, David Steele  wrote:
>>> On 8/31/17 11:56 PM, Michael Paquier wrote:
 Here is an updated patch with refreshed documentation, as a result of
 449338c which was discussed in thread
 https://www.postgresql.org/message-id/d4d951b9-89c0-6bc1-b6ff-d0b2dd5a8...@pgmasters.net.
 I am just outlining the fact that a backup history file gets created
 on standbys and that it is archived.
>>>
>>> The patch looks good overall.  It applies cleanly and passes all tests.
>>>
>>> One question.  Do we want to create this file all the time (as written),
>>> or only when archive_mode = always?
>>>
>>> It appears that CleanupBackupHistory() will immediately remove the
>>> history file when archive_mode != always so it seems useless to write
>>> it.  On the other hand the code is a bit simpler this way.  Thoughts?
>>
>> With archive_mode = off, the bytes of the backup history file are
>> still written to disk, so my take on the matter is to keep the code
>> simple.
>
> I'm OK with that.
>
> I'll give Masahiko some time to respond before marking it RFC.
>

Thanks, I'll review it and send a comment by this Wednesday.

-- 
Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] Commits don't block for synchronous replication

2017-09-18 Thread Michael Paquier
On Tue, Sep 19, 2017 at 7:50 AM, Xin Zhang  wrote:
> If primary crashed at that moment, and failover to standby, the foo table is
> lost, even though the replication is synced according to
> `pg_stat_replication` view.

GUC parameters are reloaded each time a query is run, so SyncRepConfig is
filled with the parsed data of SyncRepStandbyNames once the parameter is
reloaded for the process. Still, here a commit is waiting for a signal
from a WAL sender that the wanted LSN has been correctly flushed on a
standby, so this code path should not care about the state of
SyncRepConfig saved in the context of the process; we want to know what
the checkpointer thinks about it. Hence using the WAL sender data, or
sync_standbys_defined, as the source of truth looks like the correct
concept to me, which makes this bug legitimate.

The check with SyncRepRequested() still holds true: max_wal_senders
needs a restart to be updated. Also, the other caller of
SyncStandbysDefined() requires SyncRepConfig to be set, so that caller
is fine.

I have looked at your patch and tested it, and found no problems with
it. A backpatch would be required, so I have added an entry in the next
commit fest with the status set to "ready for committer" so that this
bug does not fall through the cracks.

> A separate question, is the `pg_stat_replication` view the reliable way to
> find when to failover to a standby, or there are some other ways to ensure
> the standby is in-sync with the primary?

It shows at SQL level what is currently present in shared memory by
scanning all the WAL sender entries, so this report uses the same data
as the backends themselves; that makes it a reliable source. In Postgres
10, pg_stat_activity is also able to show users which backends are
waiting for a change to be flushed or applied on the standby, using the
wait event called SyncRep. You could make some use of that as well.
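
For example, something like this (a minimal sketch assuming a PostgreSQL 10
server; the wait event name is the one mentioned above):

```sql
-- Backends currently waiting for a synchronous replication acknowledgement.
SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE wait_event = 'SyncRep';
```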
-- 
Michael




Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test

2017-09-18 Thread Michael Paquier
On Thu, Sep 7, 2017 at 11:14 AM, Michael Paquier
 wrote:
> Or we could make upgradecheck a noop, then remove it once all the MSVC
> animals have upgraded to a newer version of the buildfarm client which
> does not use upgradecheck anymore (I am fine to send a patch or a pull
> request to Andrew for that).

This patch is logged as "waiting on author" in the current commit
fest, but any new patch will depend on the feedback that other hackers
have to offer on the set of ideas I have posted upthread. Hence I am
still unsure what the correct way to move things forward is.
So, any opinions? Peter or others?
-- 
Michael




Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks

2017-09-18 Thread Vaishnavi Prabakaran
Hi,


On Mon, Aug 14, 2017, Michael Paquier  wrote:
>Attached is a set of 3 patches:

I have reviewed the patch; first of all, it applies cleanly without any
noise.


>- 0001 removes ALLOW_DANGEROUS_LO_FUNCTIONS

I think it is better to remove "ALLOW_DANGEROUS_LO_FUNCTIONS" from
pg_config_manual.h as well.


>- 0002 replaces the superuser checks with GRANT permissions
>- 0003 refactors the LO code to improve the ACL handling. Note that
>this patch introduces a more debatable change: now there is no
>distinction between INV_WRITE and INV_WRITE|INV_READ, as when
>INV_WRITE is used INV_READ is implied. The patch as proposed does a
>complete split of both permissions, to give a system that is more
>natural and more flexible. The current behavior is documented.

>@@ -163,22 +150,16 @@ lo_read(int fd, char *buf, int len)
> 
> + if ((lobj->flags & IFS_RDLOCK) == 0)
>+ ereport(ERROR,
>+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>+ errmsg("large object descriptor %d was not opened for reading",
>+ fd)));


Do we ever reach this error? Per my understanding:

1) If the application didn't call lo_open before lo_read, then the file
descriptor validation at the beginning of this function should catch it.

2) If the application called lo_open with only write mode, it should be
able to read as well, per the documentation.

OK, in inv_open(), IFS_RDLOCK is set only when READ mode is explicitly
requested. So this causes lo_read to fail when the large object is
opened in WRITE mode? I think the documented behavior is more
appropriate: after the ACL checks for select and update permissions in
inv_open(), the IFS_RDLOCK flag should be set regardless of the mode
specified.
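
For what it's worth, a minimal psql sketch of the case in question; the
numeric mode comes from the INV_WRITE flag in libpq/libpq-fs.h (0x20000),
and the error is only expected with the 0003 patch applied:

```sql
SELECT lo_create(42);
BEGIN;
SELECT lo_open(42, x'20000'::int);   -- INV_WRITE only, no INV_READ
SELECT loread(0, 10);                -- today this works; with the patch it
                                     -- would raise "not opened for reading"
ROLLBACK;
```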

Thanks & Regards,
Vaishnavi,
Fujitsu Australia.


Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-18 Thread Michael Paquier
On Tue, Sep 19, 2017 at 12:57 PM, Michael Paquier
 wrote:
> I'd think about adjusting the comments properly for each AM, so that
> one can read those comments and catch any limitation immediately.
> The fact that this has not been pointed out before on this thread,
> despite the many mails exchanged on the matter, makes me think that
> the current code does not outline its properties appropriately.

Another thing that we could consider is adding an assertion in
XLogRegisterBuffer & friends so that the combination of REGBUF_STANDARD
and REGBUF_NO_IMAGE is forbidden. That's bugging me as well.
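
Concretely, that could be little more than the following near the top of
XLogRegisterBuffer(), where `flags` is the function's existing argument
(just a sketch):

```c
/* Sketch of the proposed sanity check: reject the combination of
 * REGBUF_STANDARD and REGBUF_NO_IMAGE outright. */
Assert(!((flags & REGBUF_STANDARD) && (flags & REGBUF_NO_IMAGE)));
```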
-- 
Michael




Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-18 Thread Michael Paquier
On Tue, Sep 19, 2017 at 12:40 PM, Amit Kapila  wrote:
> On Mon, Sep 18, 2017 at 4:03 PM, Michael Paquier
>  wrote:
>> On Mon, Sep 18, 2017 at 11:16 AM, Amit Kapila  
>> wrote:
>>> On Sat, Sep 16, 2017 at 7:15 PM, Michael Paquier
>>>  wrote:

>>>
>>> You have already noticed above that it will help when
>>> wal_consistency_checking is used, and that was the main motivation to
>>> pass REGBUF_STANDARD, apart from maintaining consistency.  It is not
>>> clear to me what is bothering you.  If your only worry about these
>>> patches is that you want this sentence to be removed from the comment
>>> because you think it is obvious or doesn't make much sense, then I
>>> think we can leave this decision to the committer.  I added it based
>>> on Tom's suggestion upthread about explaining why it is inessential
>>> or essential to set pd_lower.  I think Amit Langote just tried to
>>> mimic what I have done in the hash and btree patches to maintain
>>> consistency.  I am also not very sure whether we should write some
>>> detailed comment or leave the existing comment as it is.  I think it
>>> is just a matter of different perspectives.
>>
>> What is disturbing me a bit is that the existing comments mention
>> something that could be supported (the compression of pages), but
>> that's actually not done, and it is unlikely to happen because the
>> number of bytes in a meta page is always going to be cheaper than a
>> FPW, which would cost CPU to store when compression is enabled. So I
>> think that we should switch the comments to mention that pd_lower is
>> set because it helps with page masking, but that we don't take
>> advantage of XLOG compression in the code.
>>
>
> I think that is not true, because we do need FPWs for certain usages
> of the metapage.  Consider the case in _hash_doinsert where we
> register metabuf with just REGBUF_STANDARD: it can take advantage of
> removing the hole if pd_lower is set to its correct position.

I am not saying that no index AMs take advantage of FPW compressibility
for their meta pages. There are cases like this one, as well as one
code path in BRIN where this is useful, and it is also useful when
logging FPWs of the init forks for unlogged relations.

> There are other similar usages in the hash index.  For other indexes
> like btree, there is no such usage currently, but they can still take
> advantage of it for wal_consistency_checking.  Now, there is probably
> an argument that we should use different comments for different
> indexes as the usage varies, but I think someone looking at the code
> after reading the comments can differentiate such cases.

I'd think about adjusting the comments properly for each AM, so that
one can read those comments and catch any limitation immediately.
The fact that this has not been pointed out before on this thread,
despite the many mails exchanged on the matter, makes me think that
the current code does not outline its properties appropriately.
-- 
Michael




Re: [HACKERS] Bug with pg_basebackup and 'shared' tablespace

2017-09-18 Thread Michael Paquier
On Thu, Sep 14, 2017 at 2:17 AM, Pierre Ducroquet  wrote:
> All my apologies for the shockingly long time with no answer on this topic.

No problem. That's the concept called life I suppose.

> I will do my best to help review some patches in the current CF.

Thanks for the potential help!

> Attached to this email is the new version of the patch, checked against HEAD
> and REL_10_STABLE, with the small change required by Michael (affect true/
> false to the boolean instead of 1/0) applied.

Thanks for the new version.

So I have been pondering this patch. Allowing multiple versions of
Postgres to use the same tablespace is mainly there for upgrade
purposes, so I don't think that we should encourage such practices for
performance reasons. There is also --tablespace-mapping, which covers a
similar need, and we know that this solution works; at least people
have spent time making sure it does.

Another problem that I have with this patch is that on HEAD,
permission checks are done before receiving any data. I think that it
is really saner to do checks like that first, so that you know whether
the tablespace is available for write access before writing any data to
it. With this patch, you may end up writing a bunch of data to one
tablespace path, but then fail on another tablespace because of
permission issues, so the backup becomes useless after transferring and
writing potentially hundreds of gigabytes. It is no fun to detect that
kind of failure so late, and it can hurt how quickly instances can be
backed up and restored.

All in all, this patch gets a -1 from me in its current shape. If
Horiguchi-san or you want to proceed further with this patch, of
course feel free to do so, but I don't feel that we are actually
solving a new problem here. I am switching the patch to
"waiting on author".

Here is a nitpick about the patch by the way:
+   if (firstfile && !basetablespace)
+   {
+       /*
+        * The first file in the tablespace is its main folder, whose
+        * name cannot be guessed (PG_MAJORVER_CATVER), so we must check
+        * here that this folder can be created or is empty.
+        */
+       verify_dir_is_empty_or_create(filename,
+                                     &made_tablespace_dirs,
+                                     &found_tablespace_dirs);
+   }
+   else if (mkdir(filename, S_IRWXU) != 0)
This patch has formatting problems. You should run pgindent on it
before submitting. The code usually needs to fit within a 72-80
character window.
-- 
Michael




Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2017-09-18 Thread Pavel Stehule
2017-09-11 14:44 GMT+02:00 Tom Lane :

> Peter Eisentraut  writes:
> > On 9/8/17 13:14, Simon Riggs wrote:
> >> 2. Allow a SET to apply only for a single statement
> >> SET guc1 = x, guc2 = y FOR stmt
> >> e.g. SET max_parallel_workers = 4 FOR SELECT count(*) FROM bigtable
> >> Internally a GUC setting already exists for a single use, via
> >> GUC_ACTION_SAVE, so we just need to invoke it.
>
> > This doesn't read well to me.  It indicates to me "make this setting for
> > this query [in case I run it later]", but it does not indicate that the
> > query will be run.
>
> Robert's original LET ... IN ... syntax proposal might be better from that
> standpoint.
>

From my perspective, Robert's proposal is not well targeted at PL/pgSQL,
because it does not allow choosing the granularity.

I am not sure what the result of this discussion is:

1. this feature is wanted

2. the open question is the syntax

I am sure that GUCs are not a good design solution for PL/pgSQL. Robert's
proposal does this a bit better, but it makes more sense for environments
other than PL/pgSQL - moreover, it allows more degrees of freedom than
PL/pgSQL needs.

There is the possibility of introducing a new compile option #option to
disable the plan cache at function scope. Do you think that would be an
acceptable solution? It would be a step forward.
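
To make that concrete, here is a sketch of what such a function-scoped
option could look like; the option name "plan_cache" and the table are
hypothetical (today only #variable_conflict and #print_strict_params
exist):

```sql
CREATE OR REPLACE FUNCTION count_rows(p_val int) RETURNS bigint AS $$
#plan_cache off   -- hypothetical compile option: never cache a generic plan
BEGIN
  RETURN (SELECT count(*) FROM some_table WHERE col = p_val);
END;
$$ LANGUAGE plpgsql;
```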

Regards

Pavel



> regards, tom lane
>


Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-09-18 Thread Fabien COELHO


Hello Masahiko-san,


Attached is the latest version of the patch, incorporating the TAP tests.
Please review it.


Patch applies, compilation & make check ok.

Tests are simple and provide good coverage of new functionalities.

I would suggest adding '--unlogged-tables' to speed up the tests a little.

Comment: "# Custom initialization option, including a space"... ISTM that 
there is no space. Space is tested in the next test because of the v's and 
the --no-vacuum which turned them into space, which is enough.


Regexes are just checked against the whole output, so putting "qr{vacuum}"
twice does not check that vacuum appears twice; it checks twice that vacuum
appears once (see the sketch below). I do not think it is worth trying to
check for the v repetition, so I suggest removing one from the first test.
Repetition of ' ' is checked by the second test.
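
A small fragment, in the spirit of the TAP test, to show the difference
(variable names are made up; the Test::More calls are the usual ones):

```perl
use Test::More;

# Both of these only assert that "vacuum" occurs at least once in $out.
like($out, qr{vacuum}, 'vacuum message present');
like($out, qr{vacuum}, 'this re-checks the same thing');

# Actually counting occurrences would need something like this instead.
my $count = () = $out =~ /vacuum/g;
is($count, 2, 'vacuum appears exactly twice');
```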


Maybe you could check that the data generation message is there.

--
Fabien.




Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-18 Thread Amit Kapila
On Mon, Sep 18, 2017 at 4:03 PM, Michael Paquier
 wrote:
> On Mon, Sep 18, 2017 at 11:16 AM, Amit Kapila  wrote:
>> On Sat, Sep 16, 2017 at 7:15 PM, Michael Paquier
>>  wrote:
>>>
>>
>> You have already noticed above that it will help when
>> wal_consistency_checking is used, and that was the main motivation to
>> pass REGBUF_STANDARD, apart from maintaining consistency.  It is not
>> clear to me what is bothering you.  If your only worry about these
>> patches is that you want this sentence to be removed from the comment
>> because you think it is obvious or doesn't make much sense, then I
>> think we can leave this decision to the committer.  I added it based
>> on Tom's suggestion upthread about explaining why it is inessential
>> or essential to set pd_lower.  I think Amit Langote just tried to
>> mimic what I have done in the hash and btree patches to maintain
>> consistency.  I am also not very sure whether we should write some
>> detailed comment or leave the existing comment as it is.  I think it
>> is just a matter of different perspectives.
>
> What is disturbing me a bit is that the existing comments mention
> something that could be supported (the compression of pages), but
> that's actually not done, and it is unlikely to happen because the
> number of bytes in a meta page is always going to be cheaper than a
> FPW, which would cost CPU to store when compression is enabled. So I
> think that we should switch the comments to mention that pd_lower is
> set because it helps with page masking, but that we don't take
> advantage of XLOG compression in the code.
>

I think that is not true, because we do need FPWs for certain usages of
the metapage.  Consider the case in _hash_doinsert where we register
metabuf with just REGBUF_STANDARD: it can take advantage of removing the
hole if pd_lower is set to its correct position.  There are other similar
usages in the hash index.  For other indexes like btree, there is no such
usage currently, but they can still take advantage of it for
wal_consistency_checking.  Now, there is probably an argument that we
should use different comments for different indexes as the usage varies,
but I think someone looking at the code after reading the comments can
differentiate such cases.
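
For reference, here is a hedged sketch of the pattern under discussion;
`page`, `metad` and `metabuf` are assumed variables and the block ID is
arbitrary, but the XLogRegisterBuffer()/REGBUF_STANDARD usage follows the
existing xloginsert API:

```c
/* Keep pd_lower pointing just past the metadata when the metapage is set
 * up, so the unused space between pd_lower and pd_upper is a standard
 * "hole" that page masking understands. */
((PageHeader) page)->pd_lower =
	((char *) metad + sizeof(HashMetaPageData)) - (char *) page;

/* Later, when WAL-logging a change that touches the metapage: */
XLogBeginInsert();
XLogRegisterBuffer(1, metabuf, REGBUF_STANDARD);	/* hole may be skipped in the FPW */
/* ... XLogRegisterData() and XLogInsert() as usual ... */
```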


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] parallel.c oblivion of worker-startup failures

2017-09-18 Thread Amit Kapila
On Mon, Sep 18, 2017 at 10:00 PM, Tom Lane  wrote:
> Amit Kapila  writes:
>> Attached patch fixes these problems.
>
> Hmm, this patch adds a kill(notify_pid) after one call to
> ForgetBackgroundWorker, but the postmaster has several more such calls.
> Shouldn't they all notify the notify_pid?  Should we move that
> functionality into ForgetBackgroundWorker itself, so we can't forget
> it again?
>

Among other places, we already notify during
ReportBackgroundWorkerExit().  However, it seems to me that none of the
other places, except the one where this patch has added a call, needs
such a notification.  The other calls, such as those in
DetermineSleepTime and ResetBackgroundWorkerCrashTimes, are made for a
crashed worker, which I don't think requires a notification to the
backend, as that backend itself would have restarted.  The other place
where we call ForgetBackgroundWorker is in maybe_start_bgworkers when
rw_terminate is set to true, which again seems to be either a worker
crash or a case where someone has explicitly asked to terminate the
worker, in which case we already send a notification.

I think we need to notify the backend on start, stop, and failure to
start a worker.  OTOH, if it is harmless to send a notification even
after the worker has crashed, then we can just move that functionality
into ForgetBackgroundWorker itself, as that would simplify the code; in
fact that is the first thing that occurred to me as well, but I didn't
do it that way because I was not sure whether we want to send a
notification in all kinds of cases.
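
For what it's worth, a hedged sketch of what centralizing it might look
like (not an actual patch; the struct and field names follow
bgworker_internals.h as of PostgreSQL 10, and the helper name is made up):

```c
#include "postgres.h"
#include "postmaster/bgworker_internals.h"
#include <signal.h>

/*
 * Hypothetical helper, called from ForgetBackgroundWorker() just before
 * the RegisteredBgWorker entry is freed: wake the backend that registered
 * the worker, if any, in the same way ReportBackgroundWorkerPID() does.
 */
static void
notify_worker_requestor(RegisteredBgWorker *rw)
{
	if (rw->rw_worker.bgw_notify_pid != 0)
		kill(rw->rw_worker.bgw_notify_pid, SIGUSR1);
}
```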

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] pgbench tap tests & minor fixes.

2017-09-18 Thread Andres Freund
On 2017-09-11 15:02:21 -0400, Andrew Dunstan wrote:
> 
> 
> On 09/11/2017 01:58 PM, Tom Lane wrote:
> > Andrew Dunstan  writes:
> >> On 09/08/2017 09:40 AM, Tom Lane wrote:
> >>> Like you, I'm a bit worried about the code for extracting an exit
> >>> status from IPC::Run::run.  We'll have to keep an eye on the buildfarm
> >>> for a bit.  If there's any trouble, I'd be inclined to drop it down
> >>> to just success/fail rather than checking the exact exit code.
> >> bowerbird seems to have been made unhappy.
> > I saw that failure, but it appears to be a server-side crash:
> >
> > 2017-09-10 19:39:03.395 EDT [1100] LOG:  server process (PID 11464) was 
> > terminated by exception 0xC005
> > 2017-09-10 19:39:03.395 EDT [1100] HINT:  See C include file "ntstatus.h" 
> > for a description of the hexadecimal value.
> >
> > Given the lack of any log outputs from process 11464, it's hard to tell
> > what it was doing, but it seems not to be any of the backends running
> > pgbench queries.  So maybe an autovac worker?  I dunno.  Anyway, it's
> > difficult to credit that this commit caused the failure, even if it did
> > happen during the new test case.  I'm inclined to write it off as another
> > one of the random crashes that bowerbird seems prone to.
> >
> > If the failure proves repeatable, then of course we'll need to look
> > more closely.
> >
> > 
> 
> 
> Hmm, it had several failures and now a success. Will keep an eye on it.

There just was another failure like that:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2017-09-19%2001%3A42%3A20
I first thought it might be the new recovery tests, or the changes
leading to their addition, but it's a different test and in the middle of
the run.  Even so, I'd have looked towards my commit, except that
there are a number of previous reports that look similar.

Any chance you could get backtraces on these?

Greetings,

Andres Freund




Re: [HACKERS] Statement timeout behavior in extended queries

2017-09-18 Thread Tatsuo Ishii
> On 2017-09-10 17:12:19 -0700, Andres Freund wrote:
>> On 2017-09-11 09:10:49 +0900, Tatsuo Ishii wrote:
>> > If you don't mind, can you please commit/push the patch?
>> 
>> Ok, will do so.
> 
> And, done.  Thanks for patch and reminder!

Thanks!
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Statement timeout behavior in extended queries

2017-09-18 Thread Andres Freund
On 2017-09-10 17:12:19 -0700, Andres Freund wrote:
> On 2017-09-11 09:10:49 +0900, Tatsuo Ishii wrote:
> > If you don't mind, can you please commit/push the patch?
> 
> Ok, will do so.

And, done.  Thanks for patch and reminder!

- Andres




Re: [HACKERS] SERIALIZABLE with parallel query

2017-09-18 Thread Haribabu Kommi
On Tue, Sep 19, 2017 at 11:42 AM, Thomas Munro <
thomas.mu...@enterprisedb.com> wrote:

> On Fri, Sep 1, 2017 at 5:11 PM, Thomas Munro
>  wrote:
> > On Wed, Jun 28, 2017 at 11:21 AM, Thomas Munro
> >  wrote:
> >> [ssi-parallel-v5.patch]
> >
> > Rebased.
>
> Rebased again.
>

While testing this patch, I found a behavior difference with parallel
query support while experimenting with the test case provided in the
patch.

Note that I tested the v6 patch, but I don't think that the new version
contains anything other than a rebase.

Test steps:

CREATE TABLE bank_account (id TEXT PRIMARY KEY, balance DECIMAL NOT NULL);
INSERT INTO bank_account (id, balance) VALUES ('X', 0), ('Y', 0);

Session -1:

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT balance FROM bank_account WHERE id = 'Y';

Session -2:

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SET max_parallel_workers_per_gather = 2;
SET force_parallel_mode = on;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
set min_parallel_table_scan_size = 0;
set enable_indexscan = off;
set enable_bitmapscan = off;

SELECT balance FROM bank_account WHERE id = 'X';

Session -1:

update bank_account set balance = 10 where id = 'X';

Session -2:

update bank_account set balance = 10 where id = 'Y';
ERROR:  could not serialize access due to read/write dependencies among
transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during write.
HINT:  The transaction might succeed if retried.

Without the parallel query for the SELECT statement in session 2,
the UPDATE statement in session 2 succeeds.


Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-09-18 Thread Masahiko Sawada
On Fri, Sep 8, 2017 at 9:52 AM, Masahiko Sawada  wrote:
> On Thu, Sep 7, 2017 at 4:15 PM, Fabien COELHO  wrote:
>>
 Very very minor comments that I should have noticed before, sorry for
 this
 additional round trip.
>>>
>>>
>>> Thank you for the dedicated review!
>>
>>
>> I'm someone at times pigheaded, I think in the good sense if it is possible,
>> and I like to finish what I start:-)
>>
>> Patch applies, compiles, works, everything is fine from my point of view.
>>
>> I switched it to "Ready for Committer".
>
> Thanks.
>
>>
>> Again, if the pgbench tap test patch get through, it should be tap tested.
>>
>
> Thank you for the reminder, I'll add TAP tests once the patch gets committed.
>

Attached is the latest version of the patch, incorporating the TAP tests.
Please review it.

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


pgbench_custom_initialization_v13.patch
Description: Binary data



Re: [HACKERS] SERIALIZABLE with parallel query

2017-09-18 Thread Thomas Munro
On Fri, Sep 1, 2017 at 5:11 PM, Thomas Munro
 wrote:
> On Wed, Jun 28, 2017 at 11:21 AM, Thomas Munro
>  wrote:
>> [ssi-parallel-v5.patch]
>
> Rebased.

Rebased again.

-- 
Thomas Munro
http://www.enterprisedb.com


ssi-parallel-v7.patch
Description: Binary data



Re: [HACKERS] Small code improvement for btree

2017-09-18 Thread Masahiko Sawada
On Tue, Sep 19, 2017 at 5:39 AM, Tom Lane  wrote:
> Doug Doole  writes:
>> Looks good to me.
>> The new status of this patch is: Ready for Committer
>
> Pushed.  Grepping found a few more places that should be changed to
> use these macros rather than referencing btpo_flags directly,
> so I did that.

Thank you!

> I tend to agree with Alvaro that it'd be better to get rid of
> BT_READ/BT_WRITE in favor of using the same buffer flags used
> elsewhere; but I also agree that as long as they're there we
> should use them, so I included that change as well.
>

Agreed. Thanks, again.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] valgrind vs. shared typmod registry

2017-09-18 Thread Andres Freund
Hi,

On 2017-09-18 18:04:36 +1200, Thomas Munro wrote:
> On Mon, Sep 18, 2017 at 5:39 PM, Thomas Munro
>  wrote:
> > Here is a patch to fix that.
> 
> Here's a better one (same code, corrected commit message).

Pushed. For a second I was tempted to also replace the
palloc(sizeof(dshash_table)) with a palloc0 - but in the end it seems
actually not too bad either to be able to catch bugs like this with some
help. If you have a strong opinion either way...

Thanks,

Andres




Re: [HACKERS] Boom filters for hash joins (was: A design for amcheck heapam verification)

2017-09-18 Thread Robert Haas
On Mon, Sep 18, 2017 at 5:13 PM, Peter Geoghegan  wrote:
> On Mon, Sep 18, 2017 at 2:07 PM, Robert Haas  wrote:
>> On Mon, Sep 18, 2017 at 1:29 PM, Tom Lane  wrote:
>>> Uh, why does the planner need to be involved at all?
>>
>> Because it loses if the Bloom filter fails to filter anything.  That's
>> not at all far-fetched; consider SELECT * FROM a, b WHERE a.x = b.x
>> given a foreign key on a.x referencing b(x).
>
> Wouldn't a merge join be a lot more likely in this case anyway? Low
> selectivity hash joins with multiple batches are inherently slow; the
> wasted overhead of using a bloom filter may not matter.
>
> Obviously this is all pretty speculative. I suspect that this could be
> true, and it seems worth investigating that framing of the problem
> first.

ISTR Tomas Vondra doing some experiments with this a few years ago and
finding that it was, in fact, a problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GUC for cleanup indexes threshold.

2017-09-18 Thread Peter Geoghegan
On Wed, Apr 5, 2017 at 3:50 PM, Andres Freund  wrote:
> Hi,
>
> On 2017-04-01 03:05:07 +0900, Masahiko Sawada wrote:
>> On Fri, Mar 31, 2017 at 11:44 PM, Robert Haas  wrote:
>> [ lots of valuable discussion ]
>
> I think this patch clearly still is in the design stage, and has
> received plenty feedback this CF.  I'll therefore move this to the next
> commitfest.

Does anyone have ideas on a way forward here? I don't, but then I
haven't thought about it in detail in several months.

-- 
Peter Geoghegan




Re: [HACKERS] Creating backup history files for backups taken from standbys

2017-09-18 Thread David Steele
On 9/18/17 7:26 PM, Michael Paquier wrote:
> On Tue, Sep 19, 2017 at 8:14 AM, David Steele  wrote:
>> On 8/31/17 11:56 PM, Michael Paquier wrote:
>>> Here is an updated patch with refreshed documentation, as a result of
>>> 449338c which was discussed in thread
>>> https://www.postgresql.org/message-id/d4d951b9-89c0-6bc1-b6ff-d0b2dd5a8...@pgmasters.net.
>>> I am just outlining the fact that a backup history file gets created
>>> on standbys and that it is archived.
>>
>> The patch looks good overall.  It applies cleanly and passes all tests.
>>
>> One question.  Do we want to create this file all the time (as written),
>> or only when archive_mode = always?
>>
>> It appears that CleanupBackupHistory() will immediately remove the
>> history file when archive_mode != always so it seems useless to write
>> it.  On the other hand the code is a bit simpler this way.  Thoughts?
> 
> With archive_mode = off, the bytes of the backup history file are
> still written to disk, so my take on the matter is to keep the code
> simple.

I'm OK with that.

I'll give Masahiko some time to respond before marking it RFC.

Regards,
-- 
-David
da...@pgmasters.net




Re: [HACKERS] Creating backup history files for backups taken from standbys

2017-09-18 Thread Michael Paquier
On Tue, Sep 19, 2017 at 8:14 AM, David Steele  wrote:
> On 8/31/17 11:56 PM, Michael Paquier wrote:
>> Here is an updated patch with refreshed documentation, as a result of
>> 449338c which was discussed in thread
>> https://www.postgresql.org/message-id/d4d951b9-89c0-6bc1-b6ff-d0b2dd5a8...@pgmasters.net.
>> I am just outlining the fact that a backup history file gets created
>> on standbys and that it is archived.
>
> The patch looks good overall.  It applies cleanly and passes all tests.
>
> One question.  Do we want to create this file all the time (as written),
> or only when archive_mode = always?
>
> It appears that CleanupBackupHistory() will immediately remove the
> history file when archive_mode != always so it seems useless to write
> it.  On the other hand the code is a bit simpler this way.  Thoughts?

With archive_mode = off, the bytes of the backup history file are
still written to disk, so my take on the matter is to keep the code
simple.
-- 
Michael




Re: [HACKERS] Creating backup history files for backups taken from standbys

2017-09-18 Thread David Steele
Hi Michael,

On 8/31/17 11:56 PM, Michael Paquier wrote:

> Here is an updated patch with refreshed documentation, as a result of
> 449338c which was discussed in thread
> https://www.postgresql.org/message-id/d4d951b9-89c0-6bc1-b6ff-d0b2dd5a8...@pgmasters.net.
> I am just outlining the fact that a backup history file gets created
> on standbys and that it is archived.

The patch looks good overall.  It applies cleanly and passes all tests.

One question.  Do we want to create this file all the time (as written),
or only when archive_mode = always?

It appears that CleanupBackupHistory() will immediately remove the
history file when archive_mode != always so it seems useless to write
it.  On the other hand the code is a bit simpler this way.  Thoughts?

Regards,
-- 
-David
da...@pgmasters.net




[HACKERS] Commits don't block for synchronous replication

2017-09-18 Thread Xin Zhang
With synchronous replication set up between a primary and a standby,
commits on the primary should be blocked by the function
`SyncRepWaitForLSN()` until their effects are replicated to the standby.

Currently, `SyncRepWaitForLSN()` fast exits if the `SyncStandbysDefined()`
is false, which means GUC `synchronous_standby_names` is empty.

There is a race condition where the GUC is set but not yet reflected in
some backends. However, `pg_stat_replication` shows `sync_state` as
`sync` and `state` as `streaming`, which means a commit should block and
get replicated to the standby before it returns.

The proposed fix is to NOT do the fast exit based on the GUC, but only rely
on `WalSndCtl->sync_standbys_defined`.

Here is a quick repro:
- setup async replication
- create a backend
- set breakpoint at SyncRepWaitForLSN()
- Issue a ddl like `create table foo(c int)`
- Then the backend will be break at breakpoint SyncRepWaitForLSN()
- Set the GUC `synchronous_standby_names` to '*' in `postgresql.conf` to
trigger the synchronous replication.
- `pg_ctl` reload to signal the backend to reload the conf
- Check the `pg_stat_replication` to ensure `state` is `streaming` and
`sync_state` is `sync`.
- In this case, we expect the DDL is blocked.
- Then, step through the breakpoint until line:
```
-> 163 if (!SyncRepRequested() || !SyncStandbysDefined())
   164 return;
```
- Check the content of the GUC as well as the
`WalSndCtl->sync_standbys_defined`
```
(lldb) p SyncRepStandbyNames
(char *) $1 = 0x7fae31c03a80 ""
(lldb) p WalSndCtl->sync_standbys_defined
(bool) $2 = '\x01'
(lldb)
```
- As you can see, the GUC does not yet reflect the new content '*', but
`sync_standbys_defined` has already been changed to `1` by the
checkpointer process.
- If we continue, the function will return, which means the foo table
will only be created on the primary, not on the standby.

If the primary crashes at that moment and we fail over to the standby,
the foo table is lost, even though replication is in sync according to
the `pg_stat_replication` view.

This is the patch we suggest as the fix:

```
diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index 8677235411..962772ef8f 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -156,11 +156,9 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
 	mode = Min(SyncRepWaitMode, SYNC_REP_WAIT_FLUSH);
 
 	/*
-	 * Fast exit if user has not requested sync replication, or there are no
-	 * sync replication standby names defined. Note that those standbys don't
-	 * need to be connected.
+	 * Fast exit if user has not requested sync replication.
 	 */
-	if (!SyncRepRequested() || !SyncStandbysDefined())
+	if (!SyncRepRequested())
 		return;
 
 	Assert(SHMQueueIsDetached(&(MyProc->syncRepLinks)));
```

A separate question, is the `pg_stat_replication` view the reliable way to
find when to failover to a standby, or there are some other ways to ensure
the standby is in-sync with the primary?

Thanks,
Xin Zhang, Ashwin Agrawal, and Asim R Praveen


[HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?

2017-09-18 Thread Peter Geoghegan
As I pointed out a couple of times already [1], we don't currently
sanitize ICU's BCP 47 language tags within CREATE COLLATION. CREATE
COLLATION will accept literally any string as a language tag as things
stand, even when the string is unambiguously bogus. While I accept
that there are limits on how far you can take sanitizing the BCP 47
tag format, due to its extensibility and "best effort" emphasis on
forward and backward compatibility, we can and should do more here,
IMHO. We should at least do the bare minimum, which has no possible
downside, and some notable upsides.

If I hack the CREATE COLLATION code to put any language tag string
provided by the user through the same sanitization process that initdb
already puts ICU language tags through, then we do much better. CREATE
COLLATION rejects syntax errors, which seems desirable:

postgres=# CREATE COLLATION test1 (provider = icu, locale = 'en-x-icu');
CREATE COLLATION
postgres=# CREATE COLLATION test2 (provider = icu, locale = 'foo bar baz');
ERROR:  XX000: could not convert locale name "foo bar baz" to language
tag: U_ILLEGAL_ARGUMENT_ERROR
LOCATION:  get_icu_language_tag, collationcmds.c:454
postgres=# CREATE COLLATION test3 (provider = icu, locale = 'en-gb-icu');
ERROR:  XX000: could not convert locale name "en-gb-icu" to language
tag: U_ILLEGAL_ARGUMENT_ERROR
LOCATION:  get_icu_language_tag, collationcmds.c:454
postgres=# CREATE COLLATION test4 (provider = icu, locale = 'not-a-country');
CREATE COLLATION

(To be more specific, I'm calling
get_icu_language_tag()/uloc_toLanguageTag() [2] as an extra step for
CREATE COLLATION here.)
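
The check itself is tiny; roughly the following, minus proper error
reporting (uloc_toLanguageTag() is the real ICU call; the wrapper is only
an illustration):

```c
#include <stdbool.h>
#include <unicode/uloc.h>

/* Return true only if ICU can parse loc_str as a BCP 47 language tag. */
static bool
icu_language_tag_is_valid(const char *loc_str)
{
	char		langtag[ULOC_FULLNAME_CAPACITY];
	UErrorCode	status = U_ZERO_ERROR;

	uloc_toLanguageTag(loc_str, langtag, sizeof(langtag),
					   true /* strict */, &status);
	return !U_FAILURE(status);
}
```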

It's not like the current behavior is a disaster, or that the
alternative behavior that I propose is perfect. The collation behavior
you end up with today, having specified a language tag with a syntax
error, is the totally generic base DUCET collation behavior. Using 'foo
bar baz' is effectively the same as using the preinstalled 'und-x-icu'
collation, which I'm pretty sure is the same as using any current
English locale anyway. That said, falling back on the most useful
collation behavior based on inferences about the language tag is
supposed to be about rolling with the complexities of
internationalization, like political changes that are not yet
accounted for by the CLDR/ICU version, and so on. It's no
justification for not letting the user know when they have fat-fingered
a language tag, which they could easily miss. These are *syntax*
errors.

At one point a couple of months back, it was understood that
get_icu_language_tag() might not always work with (assumed) valid
locale names -- that is at least the impression that the commit
message of eccead9 left me with. But, that was only with ICU 4.2, and
in any case we've since stopped creating keyword variants at initdb
time for other reasons (see 2bfd1b1 for details of those other
reasons). I tend to think that we should not install any language tag
that uloc_toLanguageTag() does not accept as valid on general
principle (so not just at initdb time, when it's actually least
needed).

Thoughts? I can write a patch for this, if that helps. It should be
straightforward.

[1] 
postgr.es/m/cah2-wzm22vtxvd-e1oz90de8z_m61_8amhsdozf1pwrkfrm...@mail.gmail.com
[2] 
https://ssl.icu-project.org/apiref/icu4c/uloc_8h.html#a1d50c91925ca3853fce6f28cf7390c3c
-- 
Peter Geoghegan




[HACKERS] Postgres 9.6 Logical and Physical replication

2017-09-18 Thread guedim
Hi guys

I am working with Postgres 9.6 on a master/slave cluster using streaming
replication.
I would like to add a new slave server, but one that uses logical
replication.

I tried a few configurations but it did not work :(

https://github.com/guedim/postgres-streaming-replication

Here is the image of what I have in mind: [image omitted]

Thanks for any help!



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: [HACKERS] Small patch for pg_basebackup argument parsing

2017-09-18 Thread Pierre Ducroquet
On Monday, September 18, 2017 5:13:38 PM CEST Tom Lane wrote:
> Ryan Murphy  writes:
> > Looked thru the diffs and it looks fine to me.
> > Changing a lot of a integer/long arguments that were being read directly
> > via atoi / atol to be read as strings first, to give better error
> > handling.
> > 
> > This looks good to go to me.  Reviewing this as "Ready for Committer".
> > Thanks for the patch, Pierre!
> 
> I took a quick look through this and had a few thoughts:

I agree with your suggestions, I will try to produce a new patch before the 
end of the week.

> * If we're taking the trouble to sanity-check the input, I think we should
> also check for ERANGE (i.e., overflow).
> 
> * I concur with Robert that you might as well just test for "*endptr !=
> '\0'" instead of adding a strlen() call.
> 
> * Replicating fiddly little code sequences like this all over the place
> makes me itch.  There's too much chance to get it wrong, and even more
> risk of someone taking shortcuts.  It has to be not much more complicated
> than using atoi(), or we'll just be doing this exercise again in a few
> years.  So I'm thinking we need to introduce a convenience function,
> perhaps located in src/common/, or else src/fe_utils/.
> 
> * Changing variables from int to long int is likely to have unpleasant
> consequences on platforms where they're different; or at least, I don't
> particularly feel like auditing the patch to ensure that that's not a
> problem anywhere.  So I think we should not do that but just make the
> convenience function return int, with a suitable overflow test for that
> result size.  There's existing logic you can copy in
> src/backend/nodes/read.c:
> 
> errno = 0;
> val = strtol(token, &endptr, 10);
> if (endptr != token + length || errno == ERANGE
> #ifdef HAVE_LONG_INT_64
> /* if long > 32 bits, check for overflow of int4 */
> 
> || val != (long) ((int32) val)
> 
> #endif
> )
> ... bad integer ...
> 
> If there are places where we actually do want a long result, we
> could have two convenience functions, one returning int and one long.
> 
> 
> The exact form of the convenience function is up for grabs, but one
> way we could do it is
> 
> bool pg_int_convert(const char *str, int *value)
> 
> (where a true result indicates a valid integer value).
> Then typical usage would be like
> 
> if (!pg_int_convert(optarg, &compresslevel) ||
> compresslevel < 0 || compresslevel > 9)
> ... complain-and-exit ...
> 
> There might be something to be said for folding the range checks
> into the convenience function:
> 
> bool pg_int_convert(const char *str, int min, int max, int *value)
> 
> if (!pg_int_convert(optarg, 0, 9, &compresslevel))
> ... complain-and-exit ...
> 
> That way you could protect code like
> 
>   standby_message_timeout = atoi(optarg) * 1000;
> 
> fairly easily:
> 
> if (!pg_int_convert(optarg, 0, INT_MAX/1000,
> &standby_message_timeout))
> ... complain-and-exit ...
> standby_message_timeout *= 1000;
> 
>   regards, tom lane






Re: [HACKERS] Clarification in pg10's pgupgrade.html step 10 (upgrading standby servers)

2017-09-18 Thread Andreas Joseph Krogh
On Monday, 18 September 2017 at 16:28:07, Bruce Momjian wrote:
On Sat, Sep 16, 2017 at 11:36:40PM +0200, Andreas Joseph Krogh wrote:
 > On Saturday, 16 September 2017 at 18:34:51, Bruce Momjian <br...@momjian.us> wrote:
 >     No.  If you ran initdb with --waldir on the new primary, you will create
 >     a symbolic link in the PGDATA directory, and a directory outside of
 >     PGDATA for storing the WAL.  When you run rsync on the new primary
 >     PGDATA directory, you will copy the symlink in the PGDATA directory, but
 >     it will point to probably nothing on the standby.
 >
 >  
 >  
 > The misunderstanding here comes from the fact that I used pg_upgradecluster
 > like this:
 > pg_upgradecluster --method=upgrade --link 9.6 main
 >
 > and it didn't issue initdb with --waldir on the new cluster (because
 > pg_upgradecluster issues initdb for you), so pg_wal ended up in $PGDIR
 > because pg_upgradecluster didn't figure out the old cluster was initialized
 > with --xlogdir. This is why I thought it made sense mentioning that one had
 > to move pg_wal manually.
 >
 > I know it's debian-stuff, but maybe it's worth mentioning pg_upgradecluster
 > somewhere and recommend not using it? It seems to start the new cluster
 > automatically, and when upgrading standbys section 10 tells you not to do
 > that.

 So you didn't really follow the instructions, but instead are trying to
 use the standby part of the instructions and found a problem with the
 way pg_upgradecluster handled it.  We really can't document this.

 It would be good to report the bug to pg_upgradecluster developers
 though.

 Yes, I can see rsync not working in that case.
 
 
 
Actually I didn't know about the --waldir switch of initdb and have always
moved pg_xlog manually and then symlinked it.
 
 
 
 >     > should be clearly pointed out that copying pg_wal is only needed in those
 >     > cases, and that it can be done with whatever network-copying procedure you're
 >     > familiar with, ie. scp/rsync. This step is not similar to the steps required
 >     > for copying tablespaces outside $PGDATA, so it's worth documenting explicitly.
 >     > Maybe also telling users to ensure the symlink (in $PGDATA) to pg_wal on
 >     > standby points to pg_wal.
 >
 >     Why tell them new instructions when the rsync instructions work fine?
 >     What is the value?
 >
 >  
 > The rsync instructions mentioned in 10.F all address the --link scenario and
 > use "--delete --hard-links --size-only", and "merge 2 source-dirs into one",
 > which isn't relevant when copying pg_wal.
 >
 > This sentence:
 > "If you have relocated pg_wal outside the data directories, rsync must be run
 > on those directories too."
 > implies one must follow the rsync pattern elsewhere in 10.F, which isn't really
 > true. Maybe re-wording it to:
 > "If you have relocated pg_wal outside the data directories you must copy it
 > over to the new standby, and ensure the symlink from $PGDATA points to it"
 > helps?

 We can't document every possible configuration, especially if a
 secondary tool is used in the middle.
 
 
But we're not talking about many different configurations; we're addressing
the case where pg_wal is located outside $PGDATA.
So it basically boils down to the last sentence in 10.F:

    If you have relocated pg_wal outside the data directories, rsync must be
    run on those directories too.

The word "must" here isn't correct. The point is that you have to copy the
WAL directory manually from the primary to the standby and ensure the symlink
points to this new location on the standby. So I still think something like
this is better: "If you have relocated pg_wal outside the data directories
you must copy it over to the new standby, and ensure the symlink from
$PGDATA/pg_wal points to it". I think it eliminates any doubt and makes the
instructions complete and easy to follow.
 
Thanks.

 --
 Andreas Joseph Krogh


Re: [HACKERS] Boom filters for hash joins (was: A design for amcheck heapam verification)

2017-09-18 Thread Peter Geoghegan
On Mon, Sep 18, 2017 at 2:07 PM, Robert Haas  wrote:
> On Mon, Sep 18, 2017 at 1:29 PM, Tom Lane  wrote:
>> Uh, why does the planner need to be involved at all?
>
> Because it loses if the Bloom filter fails to filter anything.  That's
> not at all far-fetched; consider SELECT * FROM a, b WHERE a.x = b.x
> given a foreign key on a.x referencing b(x).

Wouldn't a merge join be a lot more likely in this case anyway? Low
selectivity hash joins with multiple batches are inherently slow; the
wasted overhead of using a bloom filter may not matter.

Obviously this is all pretty speculative. I suspect that this could be
true, and it seems worth investigating that framing of the problem
first.

-- 
Peter Geoghegan




Re: [HACKERS] Small patch for pg_basebackup argument parsing

2017-09-18 Thread Tom Lane
Ryan Murphy  writes:
> Looked thru the diffs and it looks fine to me.
> Changing a lot of a integer/long arguments that were being read directly via 
> atoi / atol
> to be read as strings first, to give better error handling.
> 
> This looks good to go to me.  Reviewing this as "Ready for Committer".
> Thanks for the patch, Pierre!

I took a quick look through this and had a few thoughts:

* If we're taking the trouble to sanity-check the input, I think we should
also check for ERANGE (i.e., overflow).

* I concur with Robert that you might as well just test for "*endptr != '\0'"
instead of adding a strlen() call.

* Replicating fiddly little code sequences like this all over the place
makes me itch.  There's too much chance to get it wrong, and even more
risk of someone taking shortcuts.  It has to be not much more complicated
than using atoi(), or we'll just be doing this exercise again in a few
years.  So I'm thinking we need to introduce a convenience function,
perhaps located in src/common/, or else src/fe_utils/.

* Changing variables from int to long int is likely to have unpleasant
consequences on platforms where they're different; or at least, I don't
particularly feel like auditing the patch to ensure that that's not a
problem anywhere.  So I think we should not do that but just make the
convenience function return int, with a suitable overflow test for that
result size.  There's existing logic you can copy in
src/backend/nodes/read.c:

errno = 0;
val = strtol(token, &endptr, 10);
if (endptr != token + length || errno == ERANGE
#ifdef HAVE_LONG_INT_64
/* if long > 32 bits, check for overflow of int4 */
|| val != (long) ((int32) val)
#endif
)
... bad integer ...

If there are places where we actually do want a long result, we
could have two convenience functions, one returning int and one long.


The exact form of the convenience function is up for grabs, but one
way we could do it is

bool pg_int_convert(const char *str, int *value)

(where a true result indicates a valid integer value).
Then typical usage would be like

if (!pg_int_convert(optarg, &compresslevel) ||
compresslevel < 0 || compresslevel > 9)
... complain-and-exit ...

There might be something to be said for folding the range checks
into the convenience function:

bool pg_int_convert(const char *str, int min, int max, int *value)

if (!pg_int_convert(optarg, 0, 9, &compresslevel))
... complain-and-exit ...

That way you could protect code like

standby_message_timeout = atoi(optarg) * 1000;

fairly easily:

if (!pg_int_convert(optarg, 0, INT_MAX/1000,
                    &standby_message_timeout))
... complain-and-exit ...
standby_message_timeout *= 1000;
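
For the record, a minimal sketch of the range-checking variant, just to
make the shape concrete (the eventual in-tree helper may well differ):

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse str as a base-10 integer within [min, max]; true on success. */
bool
pg_int_convert(const char *str, int min, int max, int *value)
{
	char	   *endptr;
	long		val;

	errno = 0;
	val = strtol(str, &endptr, 10);
	if (endptr == str || *endptr != '\0' || errno == ERANGE ||
		val < min || val > max)
		return false;
	*value = (int) val;
	return true;
}
```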

regards, tom lane




Re: [HACKERS] Boom filters for hash joins (was: A design for amcheck heapam verification)

2017-09-18 Thread Robert Haas
On Mon, Sep 18, 2017 at 1:29 PM, Tom Lane  wrote:
> Uh, why does the planner need to be involved at all?

Because it loses if the Bloom filter fails to filter anything.  That's
not at all far-fetched; consider SELECT * FROM a, b WHERE a.x = b.x
given a foreign key on a.x referencing b(x).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-09-18 Thread Robert Haas
On Mon, Sep 18, 2017 at 8:02 AM, Ashutosh Bapat
 wrote:
> partition pruning might need partexprs to look up relevant quals, but
> nullable_partexprs doesn't have any use there. So maybe we should add
> nullable_partexpr to RelOptInfo as part of 0002 (partition-wise join
> implementation) instead of 0001. What do you think?

+1.

>> - I'm not entirely sure whether maintaining partexprs and
>> nullable_partexprs is the right design.  If I understand correctly,
>> whether or not a partexpr is nullable is really a per-RTI property,
>> not a per-expression property.  You could consider something like
>> "Relids nullable_rels".
>
> That's true. However in order to decide whether an expression falls on
> nullable side of a join, we will need to call pull_varnos() on it and
> check the output against nullable_rels. Separating the expressions
> themselves avoids that step.

Good point.  Also, I'm not sure about cases like this:

SELECT * FROM (SELECT b.x, b.y FROM a LEFT JOIN b ON a.x = b.x WHERE
a.y = b.y) w LEFT JOIN c ON w.x = c.x AND w.y = c.y;

Suppose the relations are all partitioned by (x, y) but that the =
operator is not strict.  A partition-wise join is valid between a and
b, but we can't regard w as partitioned any more, because w.x might
contain nulls in partitions where the partitioning scheme wouldn't
allow them.  On the other hand, if the subquery were to select a.x,
a.y then clearly it would be fine: there would be no possibility of a
NULL having been substituted for a proper value.

What if the subquery selected a.x, b.y?  Initially, I thought that
would be OK too, because of the fact that the a.y = b.y clause is in
the WHERE clause rather than the join condition.  But on further
thought I think that probably doesn't work, because with = being a
non-strict operator there's no guarantee that it would remove any
nulls introduced by the left join.  Of course, if the subselect had a
WHERE clause saying that b.x/b.y IS NOT NULL then having the SELECT
list mention those columns would be fine.

>> - The naming of enable_partition_wise_join might also need some
>> thought.  What happens when we also have partition-wise aggregate?
>> What about the proposal to strength-reduce MergeAppend to Append --
>> would that use this infrastructure?  I wonder if we ought to call this
>> enable_partition_wise or enable_partition_wise_planning to make it a
>> bit more general.  Then, too, I've never really liked having
>> partition_wise in the GUC name because it might make someone think
>> that it makes your partitions have a lot of wisdom.  Removing the
>> underscore might help: partitionwise.  Or maybe there is some whole
>> different name that would be better.  If anyone wants to bikeshed,
>> now's the time.
>
> partitions having a lot of wisdom would be wise_partitions rather than
> partition_wise ;).

Well, maybe it's the joins that have a lot of wisdom, then.
enable_partition_wise_join could be read to mean that we should allow
partitioning of joins, but only if those joins know the secret of true
happiness.

> If partition-wise join is disabled, partition-wise aggregates,
> strength reduction of MergeAppend won't be possible on a join tree,
> but those will be possible on a base relation. Even if partition-wise
> join enabled, one may want to disable other partition-wise
> optimizations individually. So, they are somewhat independent
> switches. I don't think we should bundle all of those into one.
> Whatever names we choose for those GUCs, I think they should have same
> naming convention e.g. "partition_wise_xyz". I am open to suggestions
> about the names.

I think the chances of you getting multiple GUCs for different
partition-wise optimizations past Tom are pretty low.

>> - Instead of reorganizing add_paths_to_append_rel as you did, could
>> you just add an RTE_JOIN case to the switch?  Not sure if there's some
>> problem with that idea, but it seems like it might come out nicer.
>
> RTE_JOIN is created only for joins specified using a JOIN clause, i.e.
> syntactic joins. The joins created during query planner like rel1,
> rel2, rel3 do not have RTE_JOIN. So, we can't use RTE_JOIN there.

OK, never mind that then.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Small code improvement for btree

2017-09-18 Thread Tom Lane
Doug Doole  writes:
> Looks good to me.
> The new status of this patch is: Ready for Committer

Pushed.  Grepping found a few more places that should be changed to
use these macros rather than referencing btpo_flags directly,
so I did that.

I tend to agree with Alvaro that it'd be better to get rid of
BT_READ/BT_WRITE in favor of using the same buffer flags used
elsewhere; but I also agree that as long as they're there we
should use them, so I included that change as well.

regards, tom lane




Re: [HACKERS] Parallel Hash take II

2017-09-18 Thread Robert Haas
On Thu, Sep 14, 2017 at 10:01 AM, Thomas Munro
 wrote:
> 3.  Gather Merge and Parallel Hash Join may have a deadlock problem.
> Gather Merge needs to block waiting for tuples, but workers wait
> for all participants (including the leader) to reach barriers.  TPCH
> Q18 (with a certain set of indexes and settings, YMMV) has Gather
> Merge over Sort over Parallel Hash Join, and although it usually runs
> successfully I have observed one deadlock.  Ouch.  This seems to be a
> more fundamental problem than the blocked TupleQueue scenario.  Not
> sure what to do about that.

Thomas and I spent about an hour and a half brainstorming about this
just now.  Parallel query doesn't really have a documented deadlock
avoidance strategy, yet all committed and proposed patches other than
this one manage to avoid deadlock.  This one has had a number of
problems crop up in this area, so it struck me that it might be
violating a rule which every other patch was following.  I struggled
for a bit and finally managed to articulate what I think the
deadlock-avoidance rule is that is generally followed by other
committed and proposed patches:


Once you enter a state in which other participants might wait for you,
you must exit that state before doing anything that might wait for
another participant.


From this, it's easy to see that the waits-for graph can't contain any
cycles: if every parallel query node obeys the above rule, then a
given node can have in-arcs or out-arcs, but not both.  I also believe
it to be the case that every existing node follows this rule.  For
instance, Gather and Gather Merge wait for workers, but they aren't at
that point doing anything that can make the workers wait for them.
Parallel Bitmap Heap Scan waits for the leader to finish building the
bitmap, but that leader never waits for anyone else while building the
bitmap.  Parallel Index(-Only) Scan waits for the process advancing
the scan to reach the next page, but that process never waits for any
other while so doing.  Other types of parallel nodes -- including the
proposed Parallel Append node, which is an interesting case because
like Parallel Hash it appears in the "middle" of the parallel portion
of the plan tree rather than the root like Gather or the leaves like a
parallel scan -- don't wait at all, except for short
spinlock-protected or LWLock-protected critical sections during which
they surely don't go into any sort of long-term wait (which would be
unacceptable for other reasons anyway).

Parallel hash violates this rule only in the case of a multi-batch
hash join, and for only one reason: to avoid blowing out work_mem.
Since, consistent with resource management decisions elsewhere, each
participant is entitled to an amount of memory equal to work_mem, the
shared hash table can and does use up to (participants * work_mem),
which means that we must wait for everybody to be done with the hash
table for batch N before building the hash table for batch N+1.  More
properly, if the hash table for the current batch happens to be
smaller than the absolute maximum amount of memory we can use, we can
build the hash table for the next batch up to the point where all the
memory is used, but must then pause and wait for the old hash table to
go away before continuing.  But that means that the process for which
we are waiting violated the rule mentioned above: by not being done
with the memory, it's making other processes wait, and by returning a
tuple, it's allowing other parts of the executor to do arbitrary
computations which can themselves wait.  So, kaboom.

One simple and stupid way to avoid this deadlock is to reduce the
memory budget for the shared hash table to work_mem and remove the
barriers that prevent more than one such hash table from existing at a
time.  In the worst case, we still use (participants * work_mem),
frequently we'll use less, but there are no longer any waits for
processes that might not even be running the parallel hash node
(ignoring for the moment the problem of right and full parallel hash
joins, which might need more thought).  So no deadlock.

We can do better.  First, as long as nbatches == 1, we can use a hash
table of up to size (participants * work_mem); if we have to switch to
multiple batches, then just increase the number of batches enough that
the current memory usage drops below work_mem.  Second, following an
idea originally by Ashutosh Bapat whose relevance to this issue Thomas
Munro realized during our discussion, we can make all the batches
small enough to fit in work_mem (rather than participants * work_mem
as the current patch does) and spread them across the workers (in the
style of Parallel Append, including potentially deploying multiple
workers against the same batch if there are fewer batches than
workers).  Then, single-batch parallel hash joins use the maximum
allowable memory always, and multi-batch parallel hash joins use the
maximum allowable memory after 

Re: [HACKERS] [PATCH] Make ExplainBeginGroup()/ExplainEndGroup() public.

2017-09-18 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jul 25, 2017 at 9:54 PM, Kyotaro HORIGUCHI
>  wrote:
>> The patch is a kind of straightforward and looks fine for me.

> +1 for this change.

LGTM too, pushed.

regards, tom lane




Re: [HACKERS] Re: [COMMITTERS] pgsql: Perform only one ReadControlFile() during startup.

2017-09-18 Thread Tom Lane
Andres Freund  writes:
> On 2017-09-18 12:16:42 -0400, Robert Haas wrote:
>> I don't understand what you're talking about here.

> I often see a backend crash, psql reacting to that crash by
> reconnecting, successfully establish a new connection, just to be kicked
> off by postmaster that does the crash restart cycle.  I've not yet
> figured out when exactly this happens and when not.

It seems like this must indicate that psql sees the connection drop
significantly before the postmaster gets SIGCHLD.  Which is weird, but
if you have core dumps enabled, maybe your kernel is doing something like
(1) close files, (2) dump core, (3) exit process (resulting in SIGCHLD)?

I don't think I've ever seen this myself.  Peculiarity of some kernels,
perhaps.

regards, tom lane




Re: [HACKERS] PG_GETARG_GISTENTRY?

2017-09-18 Thread Tom Lane
Mark Dilger  writes:
> I have written a patch to fix these macro definitions across src/ and 
> contrib/.
> Find the patch, attached.  All regression tests pass on my Mac laptop.

Pushed after some rebasing and some minor additional editorialization.

The original point about adding a wrapper for GISTENTRY fetches remains
open ...

regards, tom lane




Re: [HACKERS] Re: [COMMITTERS] pgsql: Perform only one ReadControlFile() during startup.

2017-09-18 Thread Andres Freund
On 2017-09-18 12:16:42 -0400, Robert Haas wrote:
> On Mon, Sep 18, 2017 at 6:32 AM, Andres Freund  wrote:
> > One thing that I've noticed for a while, but that I was reminded of
> > again here. We very frequently allow psql to reconnect in case of crash,
> > just for postmaster to notice a child has gone and kill that session. I
> > don't recall that frequently happening, but these days it happens nearly
> > every time.
> 
> I don't understand what you're talking about here.

I often see a backend crash, psql reacting to that crash by
reconnecting, successfully establish a new connection, just to be kicked
off by postmaster that does the crash restart cycle.  I've not yet
figured out when exactly this happens and when not.

- Andres




Re: [HACKERS] psql: new help related to variables are not too readable

2017-09-18 Thread Pavel Stehule
2017-09-18 11:44 GMT+02:00 Alvaro Herrera :

> Pavel Stehule wrote:
>
> > I am looking at man pages - and they are very readable
> >
> > The rules are simple - when some variables are short - less than 6 chars,
> > then its description and label are on the same line. Between items there is
> > an empty line
>
> I was having a similar idea.  I'm not sure how many lines we would save
> that way, but it seems to me we could use a slightly longer threshold --
> say 8 chars before we cause a newline and wrap.
>
> Maybe we need to come up with guidelines for translators, also.  As
> somebody noted, I've always gone to some trouble to produce translated
> output that preserves the original's constraints; I don't see any reason
> for other translations to behave differently.
>
> (By all means, if we add entry-separating newlines, let's not put them
> as part of the translatable string.)
>

+1

Pavel


> --
> Álvaro Herrera            https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] src/test/subscription/t/005_encoding.pl is broken

2017-09-18 Thread Tom Lane
Andres Freund  writes:
> On 2017-09-18 11:50:06 -0400, Tom Lane wrote:
>> The reason seems to be that its method of waiting for replication
>> to happen is completely inapropos.  It's watching for the master
>> to say that the slave has received all the WAL, but that does not
>> ensure that the logicalrep apply workers have caught up, does it?

> To my knowledge there's not really any difference between the two in
> logical replication. Received changes are immediately applied, there's
> no equivalent to a walreceiver queuing up "logical wal" onto disk.

> So I'm not sure that theory holds.

Well, there's *something* wrong with this test's wait method.

regards, tom lane




Re: [HACKERS] src/test/subscription/t/005_encoding.pl is broken

2017-09-18 Thread Andres Freund
Hi,

On 2017-09-18 11:50:06 -0400, Tom Lane wrote:
> The reason seems to be that its method of waiting for replication
> to happen is completely inapropos.  It's watching for the master
> to say that the slave has received all the WAL, but that does not
> ensure that the logicalrep apply workers have caught up, does it?

To my knowledge there's not really any difference between the two in
logical replication. Received changes are immediately applied, there's
no equivalent to a walreceiver queuing up "logical wal" onto disk.

So I'm not sure that theory holds.

Greetings,

Andres Freund




Re: [HACKERS] Bloom filters for hash joins (was: A design for amcheck heapam verification)

2017-09-18 Thread Peter Geoghegan
On Mon, Sep 18, 2017 at 10:29 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Aug 29, 2017 at 10:22 PM, Thomas Munro
>>  wrote:
>>> (2) We could push a Bloom filter down to scans
>>> (many other databases do this, and at least one person has tried this
>>> with PostgreSQL and found it to pay off[1]).
>
>> I think the hard part is going to be figuring out a query planner
>> framework for this, because pushing down the Bloom filter down to the
>> scan changes the cost and the row-count of the scan.
>
> Uh, why does the planner need to be involved at all?  This seems like
> strictly an execution-time optimization.  Even if you wanted to try
> to account for it in costing, I think the reliability of the estimate
> would be nil, never mind any questions about whether the planner's
> structure makes it easy to apply such an adjustment.

That is how I think about it too, though I'm not at all confident of
being on the right track. (I'm pretty confident that the general idea
of using a Bloom filter for parallel hash join is a good idea,
though.)

I should point out that Bloom filters are quite insensitive to
misestimations in the total number of elements, an estimate of which
must be provided up-front (I suppose that a cardinality estimate might
make more sense for hash join bloom filters than an estimate of the
total number of elements in a batch). "Bloom Filters in Probabilistic
Verification" gives this as a key advantage for bloom filters over
other approaches to formal model verification [1]. If you can afford
to have significant misestimations and still not lose much benefit,
and if the additional overhead of having a Bloom filter is fixed and
reasonably low cost, then maybe it makes sense to apply bloom filters
as an opportunistic or optimistic optimization. Perhaps they can be
applied when there is little to lose but much to gain.

[1] http://www.ccs.neu.edu/home/pete/pub/bloom-filters-verification.pdf
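
To put rough numbers on that insensitivity, here is a small self-contained
sketch (not from any patch; it just evaluates the textbook formulas
m = -n*ln(p)/ln^2(2) and k = (m/n)*ln(2) and then the resulting false
positive rate) showing how a filter sized for an estimate degrades
gradually, rather than failing outright, when the estimate is off by 2x:

#include <math.h>
#include <stdio.h>

/* False positive rate of a Bloom filter with m bits, k hash functions
 * and n inserted elements: (1 - e^(-k*n/m))^k. */
static double
bloom_fpr(double m, double k, double n)
{
	return pow(1.0 - exp(-k * n / m), k);
}

int
main(void)
{
	double		n_est = 1000000.0;	/* estimated number of elements */
	double		p = 0.01;			/* target false positive rate */

	/* size the filter for the estimate */
	double		m = -n_est * log(p) / (log(2.0) * log(2.0));
	double		k = round(m / n_est * log(2.0));

	printf("n = n_est    -> fpr = %.4f\n", bloom_fpr(m, k, n_est));
	printf("n = 2*n_est  -> fpr = %.4f\n", bloom_fpr(m, k, 2.0 * n_est));
	printf("n = n_est/2  -> fpr = %.4f\n", bloom_fpr(m, k, n_est / 2.0));
	return 0;
}

Even with a 2x underestimate the filter still rejects the large majority of
non-members, which is the property that makes an opportunistic application
attractive.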
-- 
Peter Geoghegan




Re: [HACKERS] Bloom filters for hash joins (was: A design for amcheck heapam verification)

2017-09-18 Thread Tom Lane
Robert Haas  writes:
> On Tue, Aug 29, 2017 at 10:22 PM, Thomas Munro
>  wrote:
>> (2) We could push a Bloom filter down to scans
>> (many other databases do this, and at least one person has tried this
>> with PostgreSQL and found it to pay off[1]).

> I think the hard part is going to be figuring out a query planner
> framework for this, because pushing down the Bloom filter down to the
> scan changes the cost and the row-count of the scan.

Uh, why does the planner need to be involved at all?  This seems like
strictly an execution-time optimization.  Even if you wanted to try
to account for it in costing, I think the reliability of the estimate
would be nil, never mind any questions about whether the planner's
structure makes it easy to apply such an adjustment.

Personally though I would not bother with (2); I think (1) would
capture most of the win for a very small fraction of the complication.
Just for starters, I do not think (2) works for batched hashes.

regards, tom lane




Re: [HACKERS] Reporting query on crash even if completed

2017-09-18 Thread Magnus Hagander
On Mon, Sep 18, 2017 at 6:12 PM, Robert Haas  wrote:

> On Mon, Sep 18, 2017 at 9:47 AM, Tom Lane  wrote:
> > Now, for pg_stat_activity part of the argument why this wouldn't be
> > confusing was that you could also see the "state" field.  Maybe we
> > should try to shoehorn equivalent info into the crash log entry?
>
> Yeah, I think so.  Really, I think this is an inadvertency, and thus a
>

Yeah, I agree. That wasn't something I recall thinking about at the time, so
strictly speaking it's a bug.



> bug.  But instead of just not showing the query when the backend is
> idle, I'd change the display for that case to:
>
> DETAIL: Failed process was idle; last query was: %s
>
> Or something like that.  I guess we'd need another case for a backend
> that crashed without ever running a query.
>

+1, this is a better solution.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] Reporting query on crash even if completed

2017-09-18 Thread Tom Lane
Robert Haas  writes:
> On Mon, Sep 18, 2017 at 9:47 AM, Tom Lane  wrote:
>> Now, for pg_stat_activity part of the argument why this wouldn't be
>> confusing was that you could also see the "state" field.  Maybe we
>> should try to shoehorn equivalent info into the crash log entry?

> Yeah, I think so.  Really, I think this is an inadvertency, and thus a
> bug.  But instead of just not showing the query when the backend is
> idle, I'd change the display for that case to:

> DETAIL: Failed process was idle; last query was: %s

WFM.

> Or something like that.  I guess we'd need another case for a backend
> that crashed without ever running a query.

We already print nothing in that case, which seems fine.

regards, tom lane




Re: [HACKERS] Bloom filters for hash joins (was: A design for amcheck heapam verification)

2017-09-18 Thread Robert Haas
On Tue, Aug 29, 2017 at 10:22 PM, Thomas Munro
 wrote:
> Indeed.  Thank you for working on this!  To summarise a couple of
> ideas that Peter and I discussed off-list a while back:  (1) While
> building the hash table for a hash join we could build a Bloom filter
> per future batch and keep it in memory, and then while reading from
> the outer plan we could skip writing tuples out to future batches if
> there is no chance they'll find a match when read back in later (works
> only for inner joins and only pays off in inverse proportion to the
> join's selectivity);

You could try to use this with LEFT JOINs also, I think.  If you find
that a row can't match later, don't write it to the batch file;
instead, return a null-extended row immediately.
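
In pseudocode the outer-side batching step might then look roughly like this
(a sketch only; the function and type names are placeholders, not names from
the patch):

/*
 * Sketch: route one outer-side tuple once its batch number is known.
 * bloom_test(), null_extend_and_emit(), write_to_batch_file() and the
 * types are placeholders; "filters" holds one Bloom filter per future
 * batch, built while the inner side was being loaded.
 */
static void
route_outer_tuple(BloomFilter **filters, int batchno, uint32 hashvalue,
				  OuterTuple *tuple, bool left_outer)
{
	if (!bloom_test(filters[batchno], hashvalue))
	{
		/* No inner tuple in that batch can possibly match. */
		if (left_outer)
			null_extend_and_emit(tuple);	/* produce the result right away */
		/* for an inner join, simply drop the tuple */
		return;
	}
	write_to_batch_file(batchno, tuple);	/* might still find a match later */
}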

> (2) We could push a Bloom filter down to scans
> (many other databases do this, and at least one person has tried this
> with PostgreSQL and found it to pay off[1]).

I think the hard part is going to be figuring out a query planner
framework for this, because pushing down the Bloom filter down to the
scan changes the cost and the row-count of the scan.  This is
essentially a less-tractable version of the problem of deciding how
many parallel workers to use; in that case, at least, the number of
workers must be an integer, so we could resort to trying them all, if
the costing model were otherwise good enough.  But here the thing that
matters is selectivity, which is continuous rather than discrete.
Consider:

SELECT * FROM A LEFT JOIN (B INNER JOIN C ON B.x = C.x) ON A.y = B.y
LEFT JOIN D ON A.z > D.z;

Since the A-D join is not an equijoin, we'll have to implement it as a
nested loop; a case could also arise where a nested loop is the only
efficient strategy even for an equijoin, for example if D is huge.

Now, suppose the join to B-C is going to knock out 90% of the rows and
the join to D is going to knock out 50% of the rows, and that these
two percentages are independent of each other.  Normally, it would be
a slam-dunk to perform the B-C join first, but the Bloom filter
muddies the waters, because it might well be the case that what we
should really do is build the Bloom filter, apply it, then do the A-D
join, and do the A-(B-C) join last.  However, in order for the planner
to find that plan, it's going to need to be able to cost the nestloop
with an outer path that corresponds to a scan of A with the Bloom
filter applied.  And where is that path going to come from?

Note that it can't use that Bloom-filtered path for A unconditionally
because we might not opt to implement the A-(B-C) join as a hash join
at all; we have to build paths for NestLoop(A,D) and
NestLoop(Bloom(A),D) and we have to keep track of the fact that the
latter path has a dependency on a future hash join.  Things get
rapidly more complicated if you've got 10 or so different joins.  We
have to consider generating scan paths for the driving table with every
combination of Bloom filters that could be generated by subsequent
joins, and then all of those scan paths have to be carried through all
of the possible joinrels that can be generated later.  I think it's
not crazy to think that this could increase the cost of planning by a
factor of 2^(# of joins).

Presumably some other method needs to be found, but I'm not clear what
that other method is.  Our planner is fundamentally bottom-up, and
it's not at all clear how to make it capable of changing its mind
later about something like the number of rows that a scan of A will
return, or how much that scan will cost.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GnuTLS support

2017-09-18 Thread Jeff Janes
On Sun, Sep 17, 2017 at 2:17 PM, Andreas Karlsson  wrote:

> On 09/15/2017 06:55 PM, Jeff Janes wrote:
>
>> I can't build against gnutls-2.12.23-21.el6.x86_64 from CentOS 6.9
>>
>
> Thanks for testing my patch. I have fixed both these issues plus some of
> the other feedback. A new version of my patch is attached which should, at
> least in theory, support all GnuTLS versions >= 2.11.
>

You fixed the first issue, but I still get the second one:

be-secure-gnutls.c: In function 'get_peer_certificate':
be-secure-gnutls.c:667: error: 'GNUTLS_X509_CRT_LIST_SORT' undeclared
(first use in this function)
be-secure-gnutls.c:667: error: (Each undeclared identifier is reported only
once
be-secure-gnutls.c:667: error: for each function it appears in.)

Cheers,

Jeff


Re: [HACKERS] [PATCH]make pg_rewind to not copy useless WAL files

2017-09-18 Thread chenhj

At 2017-09-17 08:33:33, "Michael Paquier"  wrote:
>On Sun, Sep 17, 2017 at 3:19 AM, Alexander Korotkov
> wrote:
>> Hi!
>>
>> On Sat, Sep 16, 2017 at 5:56 PM, chenhj  wrote:
>>>
>>> This patch optimizes the above mentioned issues, as follows:
>>> 1. In the target data directory, do not delete the WAL files before the
>>> divergence.
>>> 2. When copying files from the source server, do not copy the WAL files
>>> before the divergence and the WAL files after the current WAL insert
>>> localtion.
>>
>>
>> Looks like cool optimization for me.  Please, add this patch to the next
>> commitfest.
>
>Agreed.
>
>> Do you think this patch should modify pg_rewind tap tests too?  It would be
>> nice to make WAL files fetching more covered by tap tests.  In particular,
>> new tests may generate more WAL files and make sure that pg_rewind fetches
>> only required files among them.
>
>This looks mandatory to me. Using pg_switch_wal() and a minimum amount
>of WAL generated you could just make the set of skipped WAL segments
>minimal.
>
>I have not checked in details, but I think that the positions where
>you are applying the filters are using the right approach.
>
>! !(strncmp(path, "pg_wal", 6) == 0 && IsXLogFileName(path + 7) &&
>Please use XLOGDIR here.
>-- 
>Michael
>

Thanks, I will use XLOGDIR and add TAP tests later.
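
For illustration, the filter could end up looking something like this (a
sketch only: divergence_segno and target_segno are made-up names for the
first segment after the last common checkpoint and the source's current
insert location; XLOGDIR, IsXLogFileName() and XLogFromFileName() are the
existing xlog_internal.h helpers):

/*
 * Sketch of where the decision could sit in pg_rewind's file map
 * processing, e.g. process_source_file().
 */
if (strncmp(path, XLOGDIR "/", strlen(XLOGDIR) + 1) == 0 &&
	IsXLogFileName(path + strlen(XLOGDIR) + 1))
{
	TimeLineID	tli;
	XLogSegNo	segno;

	XLogFromFileName(path + strlen(XLOGDIR) + 1, &tli, &segno);
	if (segno < divergence_segno || segno > target_segno)
		return;					/* neither copy nor remove this segment */
}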


--
Chen Huajun 

Re: [HACKERS] [PATCH] Add citext_pattern_ops to citext contrib module

2017-09-18 Thread Andrew Dunstan


On 09/18/2017 12:50 PM, Tom Lane wrote:
> Alexey Chernyshov  writes:
>> Do we need expected/citext.out? It seems that only
>> expected/citext_1.out has correct output.
> Well, for me, citext.out matches the results in C locale,
> and citext_1.out matches the results in en_US.  If you don't
> satisfy both of those cases, the buildfarm will not like you.
>
>   


I'm about to pick this one up - I will handle the expected file issue.

cheers

andrew

-- 
Andrew Dunstan            https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] Re: [COMMITTERS] pgsql: Perform only one ReadControlFile() during startup.

2017-09-18 Thread Simon Riggs
On 16 September 2017 at 21:27, Andres Freund  wrote:
> On 2017-09-16 15:59:01 -0400, Tom Lane wrote:

>> This does not seem like a problem that justifies a system-wide change
>> that's much more delicate than you thought.

+1

The change/reason ratio is too high.


-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] Add citext_pattern_ops to citext contrib module

2017-09-18 Thread Tom Lane
Alexey Chernyshov  writes:
> Do we need expected/citext.out? It seems that only
> expected/citext_1.out has correct output.

Well, for me, citext.out matches the results in C locale,
and citext_1.out matches the results in en_US.  If you don't
satisfy both of those cases, the buildfarm will not like you.

regards, tom lane




Re: [HACKERS] src/test/subscription/t/002_types.pl hanging on particular environment

2017-09-18 Thread Tom Lane
Amit Kapila  writes:
> On Mon, Sep 18, 2017 at 7:46 PM, Tom Lane  wrote:
>> The subscriber log includes
>> 2017-09-18 08:43:08.240 UTC [15672] WARNING:  out of background worker slots
>> Maybe that's harmless, but I'm suspicious that it's a smoking gun.

> I have noticed while fixing the issue you are talking about that this path
> is also susceptible to such problems.  In
> WaitForReplicationWorkerAttach(), it relies on
> GetBackgroundWorkerPid() to know the status of the worker which won't
> give the correct status in case of fork failure.  The patch I have
> posted has a fix for the issue,

Your GetBackgroundWorkerPid fix looks good as far as it goes, but
I feel that WaitForReplicationWorkerAttach's problems go way deeper
than that --- in fact, that function should not exist at all IMO.

The way that launcher.c is set up, it's relying on either the calling
process or the called process to free the logicalrep slot when done.
The called process might never exist at all, so the second half of
that is obviously unreliable; but WaitForReplicationWorkerAttach
can hardly be relied on either, seeing it's got a big fat exit-on-
SIGINT in it.  I thought about putting a PG_TRY, or more likely
PG_ENSURE_ERROR_CLEANUP, into it, but that doesn't fix the basic
problem: you can't assume that the worker isn't ever going to start,
just because you got a signal that means you shouldn't wait anymore.

Now, this rickety business might be fine if it were only about the
states of the caller and called processes, but we've got long-lived
shared data involved (ie the worker slots); failing to clean it up
is not an acceptable outcome.

So, frankly, I think we would be best off losing the "logical rep
worker slot" business altogether, and making do with just bgworker
slots.  The alternative is getting the postmaster involved in cleaning
up lrep slots as well as bgworker slots, and I'm going to resist
any such idea strongly.  That way madness lies, or at least an
unreliable postmaster --- if we need lrep slots, what other forty-seven
data structures are we going to need the postmaster to clean up
in the future?

I haven't studied the code enough to know why it grew lrep worker
slots in the first place.  Maybe someone could explain?

regards, tom lane




Re: [HACKERS] parallel.c oblivion of worker-startup failures

2017-09-18 Thread Tom Lane
Amit Kapila  writes:
> Attached patch fixes these problems.

Hmm, this patch adds a kill(notify_pid) after one call to
ForgetBackgroundWorker, but the postmaster has several more such calls.
Shouldn't they all notify the notify_pid?  Should we move that
functionality into ForgetBackgroundWorker itself, so we can't forget
it again?
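
Something along those lines would presumably look like this (a sketch of the
suggestion, not an actual patch; the elided part stands for the existing
slot/list bookkeeping, and the field names are from the existing
RegisteredBgWorker struct):

/*
 * Sketch: notify the registrant from ForgetBackgroundWorker() itself,
 * so no call site can forget to do it when a never-started worker is
 * dropped (e.g. after a fork failure).
 */
static void
ForgetBackgroundWorker(slist_mutable_iter *cur)
{
	RegisteredBgWorker *rw;

	rw = slist_container(RegisteredBgWorker, rw_lnode, cur->cur);

	/* ... existing slot/list bookkeeping stays as it is ... */

	if (rw->rw_worker.bgw_notify_pid != 0)
		kill(rw->rw_worker.bgw_notify_pid, SIGUSR1);
}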

regards, tom lane




Re: [HACKERS] Re: [COMMITTERS] pgsql: Perform only one ReadControlFile() during startup.

2017-09-18 Thread Robert Haas
On Mon, Sep 18, 2017 at 6:32 AM, Andres Freund  wrote:
> One thing that I've noticed for a while, but that I was reminded of
> again here. We very frequently allow psql to reconnect in case of crash,
> just for postmaster to notice a child has gone and kill that session. I
> don't recall that frequently happening, but these days it happens nearly
> every time.

I don't understand what you're talking about here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Reporting query on crash even if completed

2017-09-18 Thread Robert Haas
On Mon, Sep 18, 2017 at 9:47 AM, Tom Lane  wrote:
> Now, for pg_stat_activity part of the argument why this wouldn't be
> confusing was that you could also see the "state" field.  Maybe we
> should try to shoehorn equivalent info into the crash log entry?

Yeah, I think so.  Really, I think this is an inadvertency, and thus a
bug.  But instead of just not showing the query when the backend is
idle, I'd change the display for that case to:

DETAIL: Failed process was idle; last query was: %s

Or something like that.  I guess we'd need another case for a backend
that crashed without ever running a query.
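
In LogChildExit() terms that could look roughly like this (a sketch; the
errmsg text is simplified, "activity" comes from
pgstat_get_crashed_backend_activity() as it does today, and backend_was_idle()
is a made-up helper standing in for however the state information ends up
being exposed alongside the activity string):

/* Sketch only: pick the DETAIL wording based on what the backend was doing. */
bool		never_ran_query = (activity == NULL || activity[0] == '\0');
bool		was_idle = !never_ran_query && backend_was_idle(pid);

ereport(lev,
		(errmsg("%s (PID %d) exited abnormally", procname, pid),
		 never_ran_query ? 0 :
		 was_idle ? errdetail("Failed process was idle; last query was: %s",
							  activity) :
		 errdetail("Failed process was running: %s", activity)));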

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] src/test/subscription/t/005_encoding.pl is broken

2017-09-18 Thread Tom Lane
To reproduce the subscription-startup hang that Thomas Munro observed,
I changed src/backend/replication/logical/launcher.c like this:

@@ -427,7 +427,8 @@ retry:
bgw.bgw_notify_pid = MyProcPid;
bgw.bgw_main_arg = Int32GetDatum(slot);
 
-   if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+   if (random() < 10 ||
+   !RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
{
/* Failed to start worker, so clean up the worker slot. */
LWLockAcquire(LogicalRepWorkerLock, LW_EXCLUSIVE);

This causes about 50% of worker launch requests to fail.
With the fix I just committed, 002_types.pl gets through fine,
but 005_encoding.pl does not; it sometimes fails like this:

t/005_encoding.pl . 1/1 
#   Failed test 'data replicated to subscriber'
#   at t/005_encoding.pl line 49.
#  got: ''
# expected: '1'
# Looks like you failed 1 test of 1.
t/005_encoding.pl . Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/1 subtests 

The reason seems to be that its method of waiting for replication
to happen is completely inapropos.  It's watching for the master
to say that the slave has received all the WAL, but that does not
ensure that the logicalrep apply workers have caught up, does it?

regards, tom lane




Re: [HACKERS] Is it time to kill support for very old servers?

2017-09-18 Thread Joshua D. Drake

On 09/18/2017 04:54 AM, Robert Haas wrote:

On Mon, Sep 18, 2017 at 7:17 AM, Andres Freund  wrote:

Private:


Not so much.



Well, as much as the Internet is actually private:

https://ilccyberreport.files.wordpress.com/2013/06/nsa11.jpg

JD

;)


--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own.   *




Re: [HACKERS] [POC] hash partitioning

2017-09-18 Thread Jesper Pedersen

On 09/15/2017 02:30 AM, amul sul wrote:

Attached rebased patch, thanks.



While reading through the patch I thought it would be better to keep 
MODULUS and REMAINDER in caps when CREATE TABLE is in caps too, in order 
to highlight that these are "keywords" for hash partitioning.


Also updated some of the documentation.

V20 patch passes make check-world, and my testing (typical 64 
partitions, and various ATTACH/DETACH scenarios).


Thanks for working on this !

Best regards,
 Jesper
>From 189a40a5ca6c7a1bc79b750cbc95584b3061fda5 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Mon, 18 Sep 2017 11:13:54 -0400
Subject: [PATCH] * Documentation updates * Use caps for MODULUS / REMAINDER
 when CREATE TABLE is so too

---
 doc/src/sgml/ddl.sgml  | 10 -
 doc/src/sgml/ref/alter_table.sgml  |  2 +-
 src/backend/catalog/partition.c|  8 +++
 src/test/regress/expected/alter_table.out  | 20 +-
 src/test/regress/expected/create_table.out | 34 +++---
 src/test/regress/sql/alter_table.sql   | 20 +-
 src/test/regress/sql/create_table.sql  | 28 
 7 files changed, 61 insertions(+), 61 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 24b36caad3..e38d8fc0a0 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2881,11 +2881,11 @@ VALUES ('Albany', NULL, NULL, 'NY');
 

 
- The table is partitioned by specifying modulus and remainder for each
- partition. Each partition holds rows for which the hash value of
- partition keys when divided by specified modulus produces specified
- remainder. For more clarification on modulus and remainder please refer
- .
+ The table is partitioned by specifying a modulus and a remainder for each
+ partition. Each partition will hold the rows for which the hash value of
+ the partition key divided by the specified modulus will produce the specified
+ remainder. Refer to 
+ for additional clarification on modulus and remainder.
 

   
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index a6eefb8564..b5fb93edac 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -1424,7 +1424,7 @@ ALTER TABLE cities
Attach a partition to hash partitioned table:
 
 ALTER TABLE orders
-ATTACH PARTITION orders_p4 FOR VALUES WITH (modulus 4, remainder 3);
+ATTACH PARTITION orders_p4 FOR VALUES WITH (MODULUS 4, REMAINDER 3);
 
 
   
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 5c11c7ecea..3696b9a711 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -1694,13 +1694,13 @@ make_partition_op_expr(PartitionKey key, int keynum,
  *	CREATE TABLE simple_hash (a int, b char(10)) PARTITION BY HASH (a, b);
  *
  * CREATE TABLE p_p1 PARTITION OF simple_hash
- *	FOR VALUES WITH (modulus 2, remainder 1);
+ *	FOR VALUES WITH (MODULUS 2, REMAINDER 1);
  * CREATE TABLE p_p2 PARTITION OF simple_hash
- *	FOR VALUES WITH (modulus 4, remainder 2);
+ *	FOR VALUES WITH (MODULUS 4, REMAINDER 2);
  * CREATE TABLE p_p3 PARTITION OF simple_hash
- *	FOR VALUES WITH (modulus 8, remainder 0);
+ *	FOR VALUES WITH (MODULUS 8, REMAINDER 0);
  * CREATE TABLE p_p4 PARTITION OF simple_hash
- *	FOR VALUES WITH (modulus 8, remainder 4);
+ *	FOR VALUES WITH (MODULUS 8, REMAINDER 4);
  *
  * This function will return one of the following in the form of an
  * expression:
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 124cbe483c..304fb97291 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -3490,22 +3490,22 @@ CREATE TABLE hash_parted (
 	a int,
 	b int
 ) PARTITION BY HASH (a custom_opclass);
-CREATE TABLE hpart_1 PARTITION OF hash_parted FOR VALUES WITH (modulus 4, remainder 0);
+CREATE TABLE hpart_1 PARTITION OF hash_parted FOR VALUES WITH (MODULUS 4, REMAINDER 0);
 CREATE TABLE fail_part (LIKE hpart_1);
-ALTER TABLE hash_parted ATTACH PARTITION fail_part FOR VALUES WITH (modulus 8, remainder 4);
+ALTER TABLE hash_parted ATTACH PARTITION fail_part FOR VALUES WITH (MODULUS 8, REMAINDER 4);
 ERROR:  partition "fail_part" would overlap partition "hpart_1"
-ALTER TABLE hash_parted ATTACH PARTITION fail_part FOR VALUES WITH (modulus 8, remainder 0);
+ALTER TABLE hash_parted ATTACH PARTITION fail_part FOR VALUES WITH (MODULUS 8, REMAINDER 0);
 ERROR:  partition "fail_part" would overlap partition "hpart_1"
 DROP TABLE fail_part;
 -- check validation when attaching hash partitions
 -- check that violating rows are correctly reported
 CREATE TABLE hpart_2 (LIKE hash_parted);
 INSERT INTO hpart_2 VALUES (3, 0);
-ALTER TABLE hash_parted ATTACH PARTITION hpart_2 FOR VALUES 

Re: [HACKERS] compress method for spgist - 2

2017-09-18 Thread Alexander Korotkov
On Tue, Aug 25, 2015 at 4:05 PM, Michael Paquier 
wrote:

> On Thu, Jul 23, 2015 at 6:18 PM, Teodor Sigaev  wrote:
> >>> Poorly, by hanging boxes that straddled dividing lines off the parent
> >>> node in a big linear list. The hope would be that the case was
> >>
> >> Ok, I see, but that's not really what I was wondering. My question is
> >> this:
> >> SP-GiST partitions the space into non-overlapping sections. How can you
> >> store
> >> polygons - which can overlap - in an SP-GiST index? And how does the
> >> compress
> >> method help with that?
> >
> >
> > I believe if we found a way to index boxes then we will need a compress
> > method to build index over polygons.
> >
> > BTW, we are working on investigation a index structure for box where
> 2d-box
> > is treated as 4d-point.
>
> There has been no activity on this patch for some time now, and a new
> patch version has not been submitted, so I am marking it as return
> with feedback.
>

There is interest in this patch from PostGIS users.  It would be nice to
pick up this patch.
AFAICS, progress on this patch was suspended because we had no example of
an SP-GiST compress method in core/contrib.
However, now we have acdf2a8b committed, with 2d-to-4d indexing of boxes
using SP-GiST.  So, extending this 2d-to-4d approach to polygons would be a
good example of an SP-GiST compress method in core.  Would anyone volunteer
to write such a patch?
If nobody does, then I could do it.
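
The compress function itself should be small; roughly (a sketch with a
made-up function name, indexing only the polygon's cached bounding box so
that the existing 4d-point machinery applies unchanged):

#include "postgres.h"
#include "fmgr.h"
#include "utils/geo_decls.h"

/*
 * Sketch of an SP-GiST compress support function for polygons:
 * spg_poly_quad_compress is a made-up name.
 */
Datum
spg_poly_quad_compress(PG_FUNCTION_ARGS)
{
	POLYGON    *polygon = PG_GETARG_POLYGON_P(0);
	BOX		   *box = (BOX *) palloc(sizeof(BOX));

	*box = polygon->boundbox;	/* the polygon's cached bounding box */

	PG_RETURN_BOX_P(box);
}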

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] UPDATE of partition key

2017-09-18 Thread Dilip Kumar
On Mon, Sep 18, 2017 at 11:29 AM, Dilip Kumar  wrote:
> On Fri, Sep 15, 2017 at 4:55 PM, Amit Khandekar  
> wrote:
>> On 12 September 2017 at 12:39, Amit Khandekar  wrote:
>>> On 12 September 2017 at 11:57, Dilip Kumar  wrote:
 On Tue, Sep 12, 2017 at 11:15 AM, Amit Khandekar  
 wrote:


>> In the attached patch, we first call ExecARUpdateTriggers(), and while
>> doing that, we first save the info that a NEW row is already captured
>> (mtstate->mt_transition_capture->tcs_update_old_table == true). If it
>> captured, we pass NULL transition_capture pointer to
>> ExecARDeleteTriggers() (and ExecARInsertTriggers) so that it does not
>> again capture an extra row.
>>
>> Modified a testcase in update.sql by including DELETE statement
>> trigger that uses transition tables.
>
> Ok, this fix looks correct to me, I will review the latest patch.

Please find a few more comments.

+ * in which they appear in the PartitionDesc. Also, extract the
+ * partition key columns of the root partitioned table. Those of the
+ * child partitions would be collected during recursive expansion.
*/
+ pull_child_partition_columns(&all_part_cols, oldrelation, oldrelation);
expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc,
  lockmode, >append_rel_list,
+   &all_part_cols,

pcinfo->all_part_cols is only used in the case of UPDATE; I think we can
call pull_child_partition_columns only if the rte has updateCols?

@@ -2029,6 +2034,7 @@ typedef struct PartitionedChildRelInfo

Index parent_relid;
List   *child_rels;
+ Bitmapset  *all_part_cols;
} PartitionedChildRelInfo;

I might be missing something, but do we really need to store
all_part_cols inside PartitionedChildRelInfo?  Can't we call
pull_child_partition_columns directly inside inheritance_planner
whenever we realize that the RTE has some updateCols and we want to
check the overlap?

+extern void partition_walker_init(PartitionWalker *walker, Relation rel);
+extern Relation partition_walker_next(PartitionWalker *walker,
+  Relation *parent);
+

I don't see these functions used anywhere?

+typedef struct PartitionWalker
+{
+ List   *rels_list;
+ ListCell   *cur_cell;
+} PartitionWalker;
+

Same as above



-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] src/test/subscription/t/002_types.pl hanging on particular environment

2017-09-18 Thread Amit Kapila
On Mon, Sep 18, 2017 at 7:46 PM, Tom Lane  wrote:
> Thomas Munro  writes:
>> In this build you can see the output of the following at the end,
>> which might provide clues to the initiated.  You might need to click a
>> small triangle to unfold the commands' output.
>
>> cat ./src/test/subscription/tmp_check/log/002_types_publisher.log
>> cat ./src/test/subscription/tmp_check/log/002_types_subscriber.log
>> cat ./src/test/subscription/tmp_check/log/regress_log_002_types
>
> The subscriber log includes
> 2017-09-18 08:43:08.240 UTC [15672] WARNING:  out of background worker slots
> 2017-09-18 08:43:08.240 UTC [15672] HINT:  You might need to increase 
> max_worker_processes.
>
> Maybe that's harmless, but I'm suspicious that it's a smoking gun.
> I think perhaps this reflects a failed attempt to launch a worker,
> which the caller does not realize has failed to launch because of the
> lack of worker-fork-failure error recovery I bitched about months ago
> [1], leading to subscription startup waiting forever for a worker that's
> never going to report finishing.
>
> I see Amit K. just posted a patch in that area [2], haven't looked at it
> yet.
>

I have noticed while fixing the issue you are talking about that this path
is also susceptible to such problems.  In
WaitForReplicationWorkerAttach(), it relies on
GetBackgroundWorkerPid() to know the status of the worker which won't
give the correct status in case of fork failure.  The patch I have
posted has a fix for the issue; however, this could be an entirely
different issue altogether as it appears from your next email in this
thread.


[1] - 
https://www.postgresql.org/message-id/CAA4eK1KDfKkvrjxsKJi3WPyceVi3dH1VCkbTJji2fuwKuB%3D3uw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] src/test/subscription/t/002_types.pl hanging on particular environment

2017-09-18 Thread Tom Lane
I wrote:
> The subscriber log includes
> 2017-09-18 08:43:08.240 UTC [15672] WARNING:  out of background worker slots
> 2017-09-18 08:43:08.240 UTC [15672] HINT:  You might need to increase 
> max_worker_processes.

> Maybe that's harmless, but I'm suspicious that it's a smoking gun.

Actually: looking at the place where this is issued, namely
logicalrep_worker_launch, and comparing it to the cleanup logic
in WaitForReplicationWorkerAttach, it looks to me like the problem
is we're failing to do logicalrep_worker_cleanup() in this case.
We have initialized a logical rep worker slot, and we're just
walking away from it.

I'll go see if I can't reproduce this by injecting random failures right
there.

regards, tom lane




Re: [HACKERS] Clarification in pg10's pgupgrade.html step 10 (upgrading standby servers)

2017-09-18 Thread Bruce Momjian
On Sat, Sep 16, 2017 at 11:36:40PM +0200, Andreas Joseph Krogh wrote:
> On Saturday, 16 September 2017 at 18:34:51, Bruce Momjian <
> br...@momjian.us> wrote:
> No.  If you ran initdb with --waldir on the new primary, you will create
> a symbolic link in the PGDATA directory, and a directory outside of
> PGDATA for storing the WAL.  When you run rsync on the new primary
> PGDATA directory, you will copy the symlink in the PGDATA directory, but
> it will point to probably nothing on the standby.
> 
>  
>  
> The misunderstanding here comes from the fact that I used pg_upgradecluster
> like this:
> pg_upgradecluster --method=upgrade --link 9.6 main
>  
> and it didn't issue initdb with --waldir on the new cluster (because
> pg_upgradecluster issues initdb for you), so pg_wal ended up in $PGDATA because
> pg_upgradecluster didn't figure out the old cluster was initialized with
> --xlogdir. This is why I thought i made sense mentioning that one had to move
> pg_wal manually.
>  
> I know it's debian-stuff, but maybe it's worth mentioning pg_upgradecluster
> somewhere and recommend not using it? It seems to start the new cluster
> automatically and when upgrading standbys section 10 tells you not to do that.

So you didn't really follow the instructions, but instead are trying to
use the standby part of the instructions and found a problem with the
way pg_upgradecluster handled it.  We really can't document this.

It would be good to report the bug to pg_upgradecluster developers
though.

Yes, I can see rsync not working that case.

> > should be clearly pointed out that copying pg_wal is only needed in 
> those
> > cases, and that it can be done with whatever network-copying procedure
> you're
> > familiar with, ie. scp/rsync. This step is not similar to the steps
> required
> > for copying tablespaces outside $PGDATA, so it's worth documenting
> explicitly.
> > Maybe also telling users to ensure the symlink (in $PGDATA) to pg_wal on
> > standby points to pg_wal.
> 
> Why tell them new instructions when the rsync instructions work fine?
> What is the value?
> 
>  
> The rsync instructions mentioned in 10.F all address the --link scenario and
> use "--delete --hard-links --size-only", and "merge 2 source-dirs into one",
> which isn't relevant when copying pg_wal.
>  
> This sentence:
> "If you have relocated pg_wal outside the data directories, rsync must be run
> on those directories too."
> implies one must follow the rsync pattern elsewhere in 10.F, which isn't 
> really
> true. Maybe re-wording it to:
> "If you have relocated pg_wal outside the data directories you must copy it
> over to the new standby, and ensure the symlink from $PGDATA points to it"
> helps?

We can't document every possible configuration, especially if a
secondary tool is used in the middle.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] src/test/subscription/t/002_types.pl hanging on particular environment

2017-09-18 Thread Tom Lane
Thomas Munro  writes:
> In this build you can see the output of the following at the end,
> which might provide clues to the initiated.  You might need to click a
> small triangle to unfold the commands' output.

> cat ./src/test/subscription/tmp_check/log/002_types_publisher.log
> cat ./src/test/subscription/tmp_check/log/002_types_subscriber.log
> cat ./src/test/subscription/tmp_check/log/regress_log_002_types

The subscriber log includes
2017-09-18 08:43:08.240 UTC [15672] WARNING:  out of background worker slots
2017-09-18 08:43:08.240 UTC [15672] HINT:  You might need to increase 
max_worker_processes.

Maybe that's harmless, but I'm suspicious that it's a smoking gun.
I think perhaps this reflects a failed attempt to launch a worker,
which the caller does not realize has failed to launch because of the
lack of worker-fork-failure error recovery I bitched about months ago
[1], leading to subscription startup waiting forever for a worker that's
never going to report finishing.

I see Amit K. just posted a patch in that area [2], haven't looked at it
yet.

regards, tom lane

[1] https://www.postgresql.org/message-id/4905.1492813...@sss.pgh.pa.us
[2] 
https://www.postgresql.org/message-id/CAA4eK1KDfKkvrjxsKJi3WPyceVi3dH1VCkbTJji2fuwKuB=3...@mail.gmail.com




Re: [HACKERS] additional contrib test suites

2017-09-18 Thread Peter Eisentraut
On 9/16/17 08:10, David Steele wrote:
>>> (5) drop contrib/chkpass altogether, on the grounds that it's too badly
>>> designed, and too obsolete crypto-wise, to be useful or supportable.
>> crypt() uses the lowest 7 bits of the first 8 characters, which makes for 7.2e16 values,
>> so I would be fine with (5), then (4) as the test suite is not
>> portable.
> I'd prefer 5, but can go with 4.
> 
> I get that users need to store their own passwords, but we have support
> for SHA1 via the crypto module which seems by far the better choice.

I'm also tempted to just remove it.  It uses bad/outdated security
practices and it's also not ideal as an example module.  Any objections?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Reporting query on crash even if completed

2017-09-18 Thread Tom Lane
Andres Freund  writes:
> When a backend dies, in a manner triggering a crash restart, we
> currently log something like:
> DETAIL: Failed process was running: %s
> 
> That used to be only when there's an active query, but since
> commit 4f42b546fd87a80be30c53a0f2c897acb826ad52
> that's not the case anymore. I can't recall anybody complaining, but to
> me it seems fairly confusing to report that some query was running when
> it's actually not.

I think this is fine.  If the backend crashes in nominally-post-query
mopup, it might be useful to know what it had been running.  If it
really was idle, it seems no more confusing than in pg_stat_activity.

Now, for pg_stat_activity part of the argument why this wouldn't be
confusing was that you could also see the "state" field.  Maybe we
should try to shoehorn equivalent info into the crash log entry?

regards, tom lane




Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-09-18 Thread Jan Michálek
Thanks

2017-09-18 15:33 GMT+02:00 Daniel Gustafsson :

> > On 18 Sep 2017, at 15:31, Jan Michálek  wrote:
> >
> > I haven't made a new version recently, because I was waiting for the new
> > version. But I will have some time for this in the next months. How much
> > time do I have until the next freeze?
>
> The next commitfest runs from 2017-11-01 to 2017-11-30, I will move your
> patch
> there.
>
> cheers ./daniel




-- 
Jelen
Starší čeledín datovýho chlíva


Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-09-18 Thread Daniel Gustafsson
> On 18 Sep 2017, at 15:31, Jan Michálek  wrote:
> 
> I haven't made a new version recently, because I was waiting for the new version.
> But I will have some time for this in the next months. How much time do I have
> until the next freeze?

The next commitfest runs from 2017-11-01 to 2017-11-30, I will move your patch
there.

cheers ./daniel



Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-09-18 Thread Jan Michálek
I haven't made a new version recently, because I was waiting for the new
version. But I will have some time for this in the next months. How much
time do I have until the next freeze?

Thanks Je;

2017-09-13 1:14 GMT+02:00 Daniel Gustafsson :

> > On 08 May 2017, at 12:02, Fabien COELHO  paristech.fr> wrote:
> >
> > Hello Jan,
> >
> > Please give a number to submitted patches. I think that this was v3.
> >
> > The patch does NOT fix various issues I pointed out in my previous
> review:
> >
> > - tabs introduced in "doc/src/sgml/ref/psql-ref.sgml"
> > - too long help line in "src/bin/psql/help.c"
> > - spurious space after a comma in "src/fe_utils/print.c"
> >   and possibly elsewhere.
> >
> > On Sun, 23 Apr 2017, Jan Michálek wrote:
> >
>  Markdown include characters/sequences which are interpreted as
> markers:
>  _Italic_, **Bold**, *** => horizontal rules, > block quote... `inline
>  code`... If they are interpreted within a table cell then probably
> they
>  should be escaped somehow.
> >>
> >> I have treated "_*|<>"
> >
> > Probably not enough, see below. Note the escaping chars should also be
> escaped.
> >
> >>> I`m able to sanitize characters, but complex sequences will be
> problem. I
> >>> will look on this, but I don`t know, if I`m able to do this.
> >
> > I do not know whether only those are necessary. Have you checked?
> Guessing is probably not the right approach.
> >
> >
> > Concerning MARKDOWN, and according to the following source about github
> markdown implementation:
> >
> >   https://enterprise.github.com/downloads/en/markdown-cheatsheet.pdf
> >
> > The following characters may need to be backslash escaped, although it
> does not cover HTML stuff.
> >
> >   \   backslash
> >   `   backtick
> >   *   asterisk
> >   _   underscore
> >   {}  curly braces
> >   []  square brackets
> >   ()  parentheses
> >   #   hash mark
> >   +   plus sign
> >   -   minus sign (hyphen)
> >   .   dot
> >   !   exclamation
> >
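
A helper that backslash-escapes exactly that set before a cell value is
printed could be as small as the following sketch (written against psql's
PQExpBuffer utilities; the escaped set is the list above plus '|' for table
cells, and the function name is made up):

/* Sketch: backslash-escape markdown metacharacters in one cell value. */
static void
append_markdown_escaped(PQExpBuffer buf, const char *value)
{
	const char *p;

	for (p = value; *p; p++)
	{
		if (strchr("\\`*_{}[]()#+-.!|", *p) != NULL)
			appendPQExpBufferChar(buf, '\\');
		appendPQExpBufferChar(buf, *p);
	}
}
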
> > Another source https://genius.com/3057216 suggests (* # / ( ) [ ] < >),
> > which should protect HTML.
> >
> > However, the escaping seems to be the backslash character, NOT using
> html encoding  as done in your version.
> >
> > Where did you find the precise escaping rules for markdown? I do not
> think that it should be invented...
> >
> >
> > I have looked at RST, according to this reference:
> >
> >   http://docutils.sourceforge.net/docs/ref/rst/
> restructuredtext.html#grid-tables
> >
> > The good news is that you do not need to handle a special | case because
> you would only produce clean tables.
> >
> > I've tested UTF-8 with plane 1 (你好!) and plane 2 (!) and the alignment
> > seems to work well, incredible!
> >
> >>> My main interest in this was in rst. I'm using markdown only in github
> >>> issues and my knowledge about md is poor.
> >
> > Then maybe only do RST?!
> >
> > It looks much simpler anyway, and if you do MARKDOWN the support needs
> to be clean.
> >
> > About the code:
> >
> > I'm still at odds with the code which needs to test for markdown in order to
> > call different functions in multiple places. If you keep md then, to avoid
> > that, I would suggest extending the pg_wcs* functions with a list of
> > characters which may have different sizes, with additional args, say:
> >
> >  pg_wcssize(// same args, plus:
> > char * escaped_chars, // will require escaping
> > int escape_len, // how many chars added when escaping
> > int nllen // len of newline if substituted
> > );
> >
> > So that pg_wcssize(..., "\r", 1, 1) would behave as before (\n and \t
> are rather special cases), and the various constants could be held in the
> format description so the whole thing would be parametric.
> >
> > Same approach with format.
> >
> > That would allow to simplify the code significantly and to share it
> between MARKDOWN and others. Also, the multiple "else if" list would be
> simplified by using strchr or the escaped_chars string.
>
> This patch was moved into the current Commitfest marked “Waiting for
> author”
> with the above review.  Have you had a chance to work on it addressing the
> review comments such that we can expect a new version within this CF?
>
> cheers ./daniel




-- 
Jelen
Starší čeledín datovýho chlíva


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-09-18 Thread Nikita Glukhov

On 18.09.2017 00:38, Alvaro Herrera wrote:


Nikita Glukhov wrote:


0007-json_table-v02.patch:
  - JSON_TABLE using XMLTABLE infrastructure

0008-json_table-json-v02.patch:
  - JSON_TABLE support for json type

I'm confused ... why are these two patches and not a single one?


As I said before, json support in jsonpath looks a bit dubious to me.  So if
patch no. 2 is not accepted, then patches no. 4, 6, 8 should also be
simply skipped.  But, of course, patches 1 and 2, 3 and 4, 5 and 6, 7 and 8
can be combined.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] pg_control_recovery() return value when not in recovery

2017-09-18 Thread Joe Conway
Sorry for the top post. Sounds reasonable to me. Cannot look closely until 
Tuesday or so.

Joe


On September 17, 2017 11:29:32 PM PDT, Andres Freund  wrote:
>On 2017-09-18 07:24:43 +0100, Simon Riggs wrote:
>> On 18 September 2017 at 05:50, Andres Freund 
>wrote:
>> > Hi,
>> >
>> > Just noticed that we're returning the underlying values for
>> > pg_control_recovery() without any checks:
>> > postgres[14388][1]=# SELECT * FROM pg_control_recovery();
>> >
>> > ┌──────────────────────┬───────────────────────────┬──────────────────┬────────────────┬───────────────────────────────┐
>> > │ min_recovery_end_lsn │ min_recovery_end_timeline │ backup_start_lsn │ backup_end_lsn │ end_of_backup_record_required │
>> > ├──────────────────────┼───────────────────────────┼──────────────────┼────────────────┼───────────────────────────────┤
>> > │ 0/0                  │ 0                         │ 0/0              │ 0/0            │ f                             │
>> > └──────────────────────┴───────────────────────────┴──────────────────┴────────────────┴───────────────────────────────┘
>> > (1 row)
>> 
>> Yes, that would have made sense for these to be NULL
>
>Yea, that's what I think as well.  Joe, IIRC that's your code, do you
>agree as well?
>
>
>> > postgres[14388][1]=# SELECT pg_is_in_recovery();
>> > ┌───────────────────┐
>> > │ pg_is_in_recovery │
>> > ├───────────────────┤
>> > │ f                 │
>> > └───────────────────┘
>> > (1 row)
>> 
>> But not this, since it is a boolean and the answer is known.
>
>Oh, that was just for reference, to show that the cluster isn't in
>recovery...
>
>
>- Andres

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

[HACKERS] Re: Error: dsa_area could not attach to a segment that has been freed

2017-09-18 Thread Gaddam Sai Ram
Kindly help me with the above thread..


Thanks
G. Sai Ram







 On Fri, 15 Sep 2017 13:21:33 +0530 Gaddam Sai Ram 
gaddamsaira...@zohocorp.com wrote 




Hello everyone, 



Based on the discussion in the below thread, I built an extension using
DSA (postgres-10 beta-3, linux machine).

https://www.postgresql.org/message-id/flat/15cc91e836b.12362f6336537.3402818166899386386%40zohocorp.com#15cc91e836b.12362f6336537.3402818166899386...@zohocorp.com





Use _PG_init and the shmem hook to reserve a little bit of
traditional shared memory and initialise it to zero.  This will be
used just to share the DSA handle, but you can't actually create the
DSA area in postmaster.  In other words, this little bit of shared
memory is for "discovery", since it can be looked up by name from any
backend.




Yes, I have created memory for the DSA handle in shared memory, but not the
actual DSA area.



In each backend that wants to use your new in-memory index system,
you need to be able to attach or create the DSA area on-demand.
Perhaps you could have a get_my_shared_state() function (insert better
name) that uses a static local variable to hold a pointer to some
state.  If it's NULL, you know you need to create the state.  That
should happen only once in each backend, the first time through the
function.  In that case you need to create or attach to the DSA area
as appropriate, which you should wrap in
LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE) /
LWLockRelease(AddinShmemInitLock) to serialise the code block.
First, look up the bit of traditional shared memory to see if there
is a DSA handle published in it already.  If there is you can attach.
If there isn't, you are the first so you need to create, and publish
the handle for others to attach to.  Remember whatever state you need
to remember, such as the dsa_area, in static local variables so that
all future calls to get_my_shared_state() in that backend will be
fast.


Yes, the code is present in gstore_shmem.c (please find attached); the first
process to use DSA will create the area, and all other processes will either
attach to it or, if already attached, reuse the DSA area which is already pinned.
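
A minimal sketch of that create-or-attach pattern looks roughly like this (an
illustration of the advice quoted above, not the actual gstore_shmem.c; the names
are made up, error handling, tranche registration and the _PG_init /
RequestAddinShmemSpace boilerplate are omitted):

#include "postgres.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "utils/dsa.h"

typedef struct MySharedState
{
    dsa_handle  handle;         /* published DSA handle for other backends */
} MySharedState;

static MySharedState *my_state = NULL;
static dsa_area *my_area = NULL;        /* per-backend cached attachment */

static dsa_area *
get_my_shared_area(void)
{
    bool        found;

    if (my_area != NULL)
        return my_area;         /* fast path for all later calls in this backend */

    LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);

    my_state = ShmemInitStruct("my_dsa_state", sizeof(MySharedState), &found);

    if (!found)
    {
        /*
         * We are the first: create the area and publish its handle.  (If the
         * struct is instead created and zeroed by a shmem startup hook, test
         * my_state->handle == 0 here instead of "found".)
         */
        my_area = dsa_create(LWLockNewTrancheId());
        dsa_pin(my_area);       /* keep the area alive even with no backends attached */
        my_state->handle = dsa_get_handle(my_area);
    }
    else
    {
        /* Somebody else created it already: just attach. */
        my_area = dsa_attach(my_state->handle);
    }

    /* keep this backend's mapping for the rest of the session */
    dsa_pin_mapping(my_area);

    LWLockRelease(AddinShmemInitLock);

    return my_area;
}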





== I have created a bgworker in _PG_init; when it starts it will be the first
process to access DSA, so it will create the DSA area.

== I have a small UDF (simple_udf_func) which I call in a new backend
(process), so it will attach to the DSA area already created.

== When I call the same UDF again in the same process, since the area is
already attached and pinned, I use the same area which I stored in a global
variable while attaching/creating. Here I get the problem...



Error details: dsa_area could not attach to a segment that has been freed



While examining in detail, I found this data.

I used dsa_dump() for debugging and I found that during my error case, I get
this log:



dsa_area handle 1:
  max_total_segment_size: 0
  total_segment_size: 0
  refcnt: 0
  pinned: f
  segment bins:
  segment bin 0 (at least -2147483648 contiguous pages free):




Clearly, the data in my DSA area has been corrupted in the latter case, but my
bgworker continues to work properly with the same dsa_area handle.



At this stage, the dsa_dump() in my bgworker is as below:





dsa_area handle 1814e630:
  max_total_segment_size: 18446744073709551615
  total_segment_size: 1048576
  refcnt: 3
  pinned: t
  segment bins:
    segment bin 8 (at least 128 contiguous pages free):
      segment index 0, usable_pages = 253, contiguous_pages = 220, mapped at 0x7f0abbd58000



As I'm pinning the DSA mapping after attaching, it should stay throughout the
backend session. But I am not sure why it is freed/corrupted.





Kindly help me in fixing this issue. I have attached a copy of my extension,
which will reproduce the same issue.





Regards

G. Sai Ram


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-09-18 Thread Ashutosh Bapat
On Sat, Sep 16, 2017 at 2:53 AM, Robert Haas  wrote:
> On Fri, Sep 15, 2017 at 6:11 AM, Ashutosh Bapat
>  wrote:
>> Thanks a lot Robert.
>>
>> Here are rebased patches.
>
> I didn't get quite as much time to look at these today as I would have
> liked, but here's what I've got so far.
>
> Comments on 0001:
>
> - In the RelOptInfo, part_oids is defined in a completely different
> part of the structure than nparts, but you can't use it without nparts
> because you don't know how long it is.  I suggest moving the
> definition to just after nparts.
>
> - On the other hand, maybe we should just remove it completely.  I
> don't see it in any of the subsequent patches.  If it's used by the
> advanced matching code, let's leave it out of 0001 for now and add it
> back after the basic feature is committed.

No, it's not used by the advanced partition matching code. It was used
to match OIDs with the child rels to order those in the array. But now
that we are expanding in EIBO fashion, it is not useful. Should have
removed it earlier. Removed now.

>
> - Similarly, partsupfunc isn't used by the later patches either.  It
> seems it could also be removed, at least for now.

It's used by advanced partition matching code to compare bounds. It
will be required by partition pruning patch. But removed for now.

>
> - The comment for partexprs isn't very clear about how the lists
> inside the array work.  My understanding is that the lists have as
> many members as the partition key has columns/expressions.

Actually we are doing some preparation for partition-wise join here.
partexprs and nullable_partexprs are used in the partition-wise join
implementation patch. I have updated the prologue of the RelOptInfo
structure with comments like the below:

 * Note: A base relation will always have only one set of partition keys. But a
 * join relation has as many sets of partition keys as the number of joining
 * relations. The number of partition keys is given by
 * "part_scheme->partnatts". "partexprs" and "nullable_partexprs" are arrays
 * containing part_scheme->partnatts elements. Each element of the array
 * contains a list of partition key expressions. For a base relation each list
 * contains only one expression.  For a join relation each list contains at
 * most as many expressions as the joining relations. The expressions in a list
 * at a given position in the array correspond to the partition key at that
 * position. "partexprs" contains partition keys of non-nullable joining
 * relations and "nullable_partexprs" contains partition keys of nullable
 * joining relations. For a base relation only "partexprs" is populated.

Let me know if this looks fine. The logic to match the partition keys of
joining relations in have_partkey_equi_join() and
match_expr_to_partition_keys() becomes simpler if we arrange the
partition key expressions as an array indexed by the position of the
partition key, with each array element being a list of partition key
expressions at that position, as sketched below.
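
A rough sketch of what that matching then looks like (a hand-written
illustration, not the patch itself; partexprs and part_scheme are the
RelOptInfo fields added by this patch set):

#include "postgres.h"
#include "nodes/relation.h"

/*
 * Illustration only: find which partition key position, if any, the given
 * expression matches, assuming partexprs[pos] holds a List of expressions
 * for the key at that position.
 */
static int
match_expr_to_partition_keys_sketch(RelOptInfo *rel, Expr *expr)
{
    int         cnt;

    for (cnt = 0; cnt < rel->part_scheme->partnatts; cnt++)
    {
        ListCell   *lc;

        foreach(lc, rel->partexprs[cnt])
        {
            if (equal(lfirst(lc), expr))
                return cnt;     /* expr matches the partition key at this position */
        }
    }

    return -1;                  /* not a partition key expression of this relation */
}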

Partition pruning might need partexprs to look up relevant quals, but
nullable_partexprs doesn't have any use there. So maybe we should add
nullable_partexprs to RelOptInfo as part of 0002 (partition-wise join
implementation) instead of 0001. What do you think?

>
> - I'm not entirely sure whether maintaining partexprs and
> nullable_partexprs is the right design.  If I understand correctly,
> whether or not a partexpr is nullable is really a per-RTI property,
> not a per-expression property.  You could consider something like
> "Relids nullable_rels".

That's true. However, in order to decide whether an expression falls on the
nullable side of a join, we will need to call pull_varnos() on it and
check the output against nullable_rels. Separating the expressions
themselves avoids that step.

>
> Comments on 0002:
>
> - The relationship between deciding to set partition scheme and
> related details and the configured value of enable_partition_wise_join
> needs some consideration.  If the only use of the partition scheme is
> partition-wise join, there's no point in setting it even for a baserel
> unless enable_partition_wise_join is set -- but if there are other
> important uses for that data, such as Amit's partition pruning work,
> then we might want to always set it.  And similarly for a join: if the
> details are only needed in the partition-wise join case, then we only
> need to set them in that case, but if there are other uses, then it's
> different.  If it turns out that setting these details for a baserel
> is useful in other cases but that it's only for a joinrel in the
> partition-wise join case, then the patch gets it exactly right.  But
> is that correct?  I'm not sure.

The partition scheme contains the information about the data types of the
partition keys, which is required to compare partition bounds.
Partition pruning will need to compare constants with partition bounds
and hence will need the information contained in the partition scheme.

Re: [HACKERS] Is it time to kill support for very old servers?

2017-09-18 Thread Robert Haas
On Mon, Sep 18, 2017 at 7:17 AM, Andres Freund  wrote:
> Private:

Not so much.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Is it time to kill support for very old servers?

2017-09-18 Thread Michael Paquier
On Mon, Sep 18, 2017 at 8:17 PM, Andres Freund  wrote:
> And now you missed the "same infrastructure" part.

I am sort of aware of that per the list of parameters at the top of
fe-connect.c, thanks for the reminder :)
-- 
Michael




Re: [HACKERS] Is it time to kill support for very old servers?

2017-09-18 Thread Andres Freund


On September 18, 2017 4:15:31 AM PDT, Michael Paquier 
 wrote:
>On Mon, Sep 18, 2017 at 8:09 PM, Andres Freund 
>wrote:
>>
>>
>> On September 18, 2017 4:08:21 AM PDT, Michael Paquier
> wrote:
>>>On Mon, Sep 18, 2017 at 7:54 PM, Andres Freund 
>>>wrote:
>It seems to me that you are looking more for a connection parameter
>here.

 I'm not seeing a meaningful distinction here? Env vars and
>connection
>>>parameters are handled using the same framework in libpq.  And using
>>>the env var in the test would be better, because you'd only set one
>>>value - hard to do within our non TAP tests (i.e. in an existing
>psql,
>>>started by pg regress) otherwise.
>>>
>>>Or both? I don't really understand why an environment variable is
>>>better than a connection string. For the TAP tests, you could just
>set
>>>the base of the connection string once and you are done as well. See
>>>the SSL tests for example.
>>
>> Did you read what I wrote?
>
>Sorry, I missed the "non" with "TAP" tests. Having a connection
>parameter would still be low-cost in maintenance, so if you add that
>at the same time I won't complain.

Private:

And now you missed the "same infrastructure" part.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: [HACKERS] Is it time to kill support for very old servers?

2017-09-18 Thread Michael Paquier
On Mon, Sep 18, 2017 at 8:09 PM, Andres Freund  wrote:
>
>
> On September 18, 2017 4:08:21 AM PDT, Michael Paquier 
>  wrote:
>>On Mon, Sep 18, 2017 at 7:54 PM, Andres Freund 
>>wrote:
It seems to me that you are looking more for a connection parameter
here.
>>>
>>> I'm not seeing a meaningful distinction here? Env vars and connection
>>parameters are handled using the same framework in libpq.  And using
>>the env var in the test would be better, because you'd only set one
>>value - hard to do within our non TAP tests (i.e. in an existing psql,
>>started by pg regress) otherwise.
>>
>>Or both? I don't really understand why an environment variable is
>>better than a connection string. For the TAP tests, you could just set
>>the base of the connection string once and you are done as well. See
>>the SSL tests for example.
>
> Did you read what I wrote?

Sorry, I missed the "non" with "TAP" tests. Having a connection
parameter would still be low-cost in maintenance, so if you add that
at the same time I won't complain.
-- 
Michael




Re: [HACKERS] Is it time to kill support for very old servers?

2017-09-18 Thread Andres Freund


On September 18, 2017 4:08:21 AM PDT, Michael Paquier 
 wrote:
>On Mon, Sep 18, 2017 at 7:54 PM, Andres Freund 
>wrote:
>>>It seems to me that you are looking more for a connection parameter
>>>here.
>>
>> I'm not seeing a meaningful distinction here? Env vars and connection
>parameters are handled using the same framework in libpq.  And using
>the env var in the test would be better, because you'd only set one
>value - hard to do within our non TAP tests (i.e. in an existing psql,
>started by pg regress) otherwise.
>
>Or both? I don't really understand why an environment variable is
>better than a connection string. For the TAP tests, you could just set
>the base of the connection string once and you are done as well. See
>the SSL tests for example.

Did you read what I wrote?
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: [HACKERS] Is it time to kill support for very old servers?

2017-09-18 Thread Michael Paquier
On Mon, Sep 18, 2017 at 7:54 PM, Andres Freund  wrote:
>>It seems to me that you are looking more for a connection parameter
>>here.
>
> I'm not seeing a meaningful distinction here? Env vars and connection 
> parameters are handled using the same framework in libpq.  And using the env 
> var in the test would be better, because you'd only set one value - hard to 
> do within our non TAP tests (i.e. in an existing psql, started by pg regress) 
> otherwise.

Or both? I don't really understand why an environment variable is
better than a connection string. For the TAP tests, you could just set
the base of the connection string once and you are done as well. See
the SSL tests for example.
-- 
Michael




Re: [HACKERS] GatherMerge misses to push target list

2017-09-18 Thread Rushabh Lathia
On Mon, Sep 18, 2017 at 2:48 PM, Amit Kapila 
wrote:

> On Mon, Sep 18, 2017 at 12:46 PM, Rushabh Lathia
>  wrote:
> > Thanks Amit for the patch.
> >
> > I reviewed the code changes as well as performed more testing. Patch
> > looks good to me.
> >
>
> In that case, can you please mark the patch [1] as ready for committer in
> CF app
>

Done.


>
> > Here is the updated patch - where added test-case clean up.
> >
>
> oops, missed dropping the function.
>
>
> Thanks for the review.
>
>
> [1] - https://commitfest.postgresql.org/15/1293/
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>



-- 
Rushabh Lathia


Re: [HACKERS] Is it time to kill support for very old servers?

2017-09-18 Thread Andres Freund


On September 18, 2017 3:50:17 AM PDT, Michael Paquier 
 wrote:
>On Mon, Sep 18, 2017 at 6:53 PM, Andres Freund 
>wrote:
>> On 2017-09-13 23:39:21 -0400, Tom Lane wrote:
>>> The real problem in this area, to my mind, is that we're not testing
>that
>>> code --- either end of it --- in any systematic way.  If it's broken
>it
>>> could take us quite a while to notice.
>>
>> Independent of the thrust of my question - why aren't we adding a
>> 'force-v2' option to libpq?  A test that basically does something
>like
>> postgres[22923][1]=# \setenv PGFORCEV2 1
>> postgres[22923][1]=# \c
>> You are now connected to database "postgres" as user "andres".
>> postgres[22924][1]=>
>> seems easy enough to add, in fact I tested the above.
>>
>> And the protocol coverage of the v2 protocol seems small enough that
>a
>single not too large file ought to cover most of it quite easily.
>
>It seems to me that you are looking more for a connection parameter
>here.

I'm not seeing a meaningful distinction here? Env vars and connection 
parameters are handled using the same framework in libpq.  And using the env 
var in the test would be better, because you'd only set one value - hard to do 
within our non TAP tests (i.e. in an existing psql, started by pg regress) 
otherwise.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: [HACKERS] Is it time to kill support for very old servers?

2017-09-18 Thread Michael Paquier
On Mon, Sep 18, 2017 at 6:53 PM, Andres Freund  wrote:
> On 2017-09-13 23:39:21 -0400, Tom Lane wrote:
>> The real problem in this area, to my mind, is that we're not testing that
>> code --- either end of it --- in any systematic way.  If it's broken it
>> could take us quite a while to notice.
>
> Independent of the thrust of my question - why aren't we adding a
> 'force-v2' option to libpq?  A test that basically does something like
> postgres[22923][1]=# \setenv PGFORCEV2 1
> postgres[22923][1]=# \c
> You are now connected to database "postgres" as user "andres".
> postgres[22924][1]=>
> seems easy enough to add, in fact I tested the above.
>
> And the protocol coverage of the v2 protocol seems small enough that a
> single not too large file ought to cover most of it quite easily.

It seems to me that you are looking more for a connection parameter here.
-- 
Michael




Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-18 Thread Michael Paquier
On Mon, Sep 18, 2017 at 11:16 AM, Amit Kapila  wrote:
> On Sat, Sep 16, 2017 at 7:15 PM, Michael Paquier
>  wrote:
>> On Fri, Sep 15, 2017 at 4:22 PM, Amit Langote
>>  wrote:
>>> On 2017/09/14 16:00, Michael Paquier wrote:
 On Wed, Sep 13, 2017 at 4:43 PM, Amit Langote
  wrote:
> Sure, no problem.

 OK, I took a closer look at all patches, but did not run any manual
 tests to test the compression except some stuff with
 wal_consistency_checking.
>>>
>>> Thanks for the review.
>>>
 For SpGist, I think that there are two missing: spgbuild() and 
 spgGetCache().
>>>
>>> spgbuild() calls SpGistInitMetapage() before marking the metapage buffer
>>> dirty.  The latter already sets pd_lower correctly, so we don't need to do
>>> it explicitly in spgbuild() itself.
>>
>> Check.
>>
>>> spgGetCache() doesn't write the metapage, only reads it:
>>>
>>> /* Last, get the lastUsedPages data from the metapage */
>>> metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
>>> LockBuffer(metabuffer, BUFFER_LOCK_SHARE);
>>>
>>> metadata = SpGistPageGetMeta(BufferGetPage(metabuffer));
>>>
>>> if (metadata->magicNumber != SPGIST_MAGIC_NUMBER)
>>> elog(ERROR, "index \"%s\" is not an SP-GiST index",
>>>  RelationGetRelationName(index));
>>>
>>> cache->lastUsedPages = metadata->lastUsedPages;
>>>
>>> UnlockReleaseBuffer(metabuffer);
>>>
>>> So, I think it won't be correct to set pd_lower here, no?
>>
>> Yeah, I am just reading the code again and there is no alarm here.
>>
>>> Updated patch attached, which implements your suggested changes to the
>>> masking functions.
>>>
>>> By the way, as I noted on another unrelated thread, I will not be able to
>>> respond to emails from tomorrow until 9/23.
>>
>> No problem. Enjoy your vacations.
>>
>> I have spent some time looking at the XLOG insert code, and tested the
>> compressibility of the meta pages. And I have noticed that all pages
>> registered with REGBUF_WILL_INIT will force XLogRecordAssemble to not
>> take a FPW of the block registered because the page will be
>> reinitialized at replay, and in such cases the zero'ed page is filled
>> with the data from the record. log_newpage is used to initialize init
>> forks for unlogged relations, which is where this patch allows page
>> compression to take effect correctly. Still setting REGBUF_STANDARD
>> with REGBUF_WILL_INIT is actually a no-op, except if
>> wal_checking_consistency is used when registering a buffer for WAL
>> insertion. There is one code path though where things are useful all
>> the time: revmap_physical_extend for BRIN.
>>
>> The patch is using the correct logic, still such comments are in my
>> opinion incorrect because of the reason written above:
>> +* This won't be of any help unless we use option like REGBUF_STANDARD
>> +* while registering the buffer for metapage as otherwise, it won't try 
>> to
>> +* remove the hole while logging the full page image.
>> Here REGBUF_STANDARD is actually a no-op for btree.
>>
>
> You have already noticed above that it will help when
> wal_checking_consistency is used and that was the main motivation to
> pass REGBUF_STANDARD apart from maintaining consistency.  It is not
> clear to me what is bothering you.  If your only worry about these
> patches is that you want this sentence to be removed from the comment
> because you think it is obvious or doesn't make much sense, then I
> think we can leave this decision to committer.  I have added it based
> on Tom's suggestion above thread about explaining why it is
> inessential or essential to set pd_lower.  I think Amit Langote just
> tried to mimic what I have done in hash and btree patches to maintain
> consistency.  I am also not very sure if we should write some detailed
> comment or leave the existing comment as it is.  I think it is just a
> matter of different perspective.

What is disturbing me a bit is that the existing comments mention
something that could be supported (the compression of pages), but that's
actually not done, and it is unlikely to happen because the number of
bytes associated with a meta page is always going to be cheaper than a
FPW, which would cost CPU to store when compression is enabled. So I
think that we should switch the comments to mention that pd_lower is set
so as to help with page masking, but that we don't take advantage of
XLOG compression in the code. I agree that this is a minor point, so if
the wave of this thread is that I am too noisy, please feel free to
ignore me: the logic of the patches is still correct, still having those
comments feels like cheating a bit
;p
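
For reference, the mechanical change being discussed is tiny — roughly the
following, sketched with the GIN metapage macros (illustrative only, not the
actual patch):

#include "postgres.h"
#include "access/gin_private.h"
#include "storage/bufpage.h"

/*
 * Sketch: after initializing a metapage, advance pd_lower to the end of the
 * metadata so that the area between pd_lower and pd_upper can be treated as
 * a hole by page masking (and, for standard pages, FPW hole removal).
 */
static void
init_gin_metapage_sketch(Page page)
{
    GinMetaPageData *metad;

    GinInitPage(page, GIN_META, BLCKSZ);
    metad = GinPageGetMeta(page);

    /* ... fill in *metad here ... */

    ((PageHeader) page)->pd_lower =
        ((char *) metad + sizeof(GinMetaPageData)) - (char *) page;
}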
-- 
Michael




Re: [HACKERS] Re: [COMMITTERS] pgsql: Perform only one ReadControlFile() during startup.

2017-09-18 Thread Andres Freund
On 2017-09-17 01:07:52 -0700, Andres Freund wrote:
> On 2017-09-16 13:27:05 -0700, Andres Freund wrote:
> > > This does not seem like a problem that justifies a system-wide change
> > > that's much more delicate than you thought.
> > 
> > We need one more initialization call during crash-restart - that doesn't
> > seem particularly hard a fix.
> 
> FWIW, attached is that simple fix. Not quite ready for commit - needs
> more comments, thoughts and a glass of wine less to commit.
> 
> I'll try to come up with a tap test that tests crash restarts tomorrow -
> not sure if there's a decent way to trigger one on windows without
> writing a C function. Will play around with that tomorrow.

This took a bit longer than I anticipated. Attached.  There are
surprisingly many issues with timing that make this not as easy as I
hoped.  I think in later commits we should extend this to cover things
like the autovacuum / stats process / ... being killed - but that seems
like material for a separate commit.

One thing that I've noticed for a while, but that I was reminded of
again here: we very frequently allow psql to reconnect after a crash,
just for the postmaster to notice that a child has gone and kill that
session. I don't recall that happening frequently before, but these days
it happens nearly every time.

Regards,

Andres
>From d7cbdc120816410335d0da12197c31192657ca42 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Sun, 17 Sep 2017 23:05:58 -0700
Subject: [PATCH] Add tests for postmaster crash restarts.

Given that I managed to break this...

Author: Andres Freund
Discussion: https://postgr.es/m/20170917080752.rcmihzfmgbeuq...@alap3.anarazel.de
---
 src/test/recovery/t/013_crash_restart.pl | 192 +++
 1 file changed, 192 insertions(+)
 create mode 100644 src/test/recovery/t/013_crash_restart.pl

diff --git a/src/test/recovery/t/013_crash_restart.pl b/src/test/recovery/t/013_crash_restart.pl
new file mode 100644
index 00..e8ad24941b
--- /dev/null
+++ b/src/test/recovery/t/013_crash_restart.pl
@@ -0,0 +1,192 @@
+#
+# Tests restarts of postgres due to crashes of a subprocess.
+#
+# Two longer-running psql subprocesses are used: One to kill a
+# backend, triggering a crash-restart cycle, one to detect when
+# postmaster noticed the backend died.  The second backend is
+# necessary because it's otherwise hard to determine if postmaster is
+# still accepting new sessions (because it hasn't noticed that the
+# backend died), or because it's already restarted.
+#
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More;
+use Config;
+use Time::HiRes qw(usleep);
+
+if ($Config{osname} eq 'MSWin32')
+{
+	# some Windows Perls at least don't like IPC::Run's
+	# start/kill_kill regime.
+	plan skip_all => "Test fails on Windows perl";
+}
+else
+{
+	plan tests => 12;
+}
+
+my $node = get_new_node('master');
+$node->init(allows_streaming => 1);
+$node->start();
+
+# by default PostgresNode doesn't restart after a crash
+$node->safe_psql('postgres',
+ q[ALTER SYSTEM SET restart_after_crash = 1;
+   ALTER SYSTEM SET log_connections = 1;
+   SELECT pg_reload_conf();]);
+
+# Run psql, keeping session alive, so we have an alive backend to kill.
+my ($killme_stdin, $killme_stdout, $killme_stderr) = ('', '', '');
+my $killme = IPC::Run::start(
+	[   'psql', '-X', '-qAt', '-v', 'ON_ERROR_STOP=1', '-f', '-', '-d',
+		$node->connstr('postgres') ],
+	'<',
+	\$killme_stdin,
+	'>',
+	\$killme_stdout,
+	'2>',
+	\$killme_stderr);
+
+# Need a second psql to check if crash-restart happened.
+my ($monitor_stdin, $monitor_stdout, $monitor_stderr) = ('', '', '');
+my $monitor = IPC::Run::start(
+	[   'psql', '-X', '-qAt', '-v', 'ON_ERROR_STOP=1', '-f', '-', '-d',
+		$node->connstr('postgres') ],
+	'<',
+	\$monitor_stdin,
+	'>',
+	\$monitor_stdout,
+	'2>',
+	\$monitor_stderr);
+
+#create table, insert row that should survive
+$killme_stdin .= q[
+CREATE TABLE alive(status text);
+INSERT INTO alive VALUES($$committed-before-sigquit$$);
+SELECT pg_backend_pid();
+];
+$killme->pump until $killme_stdout =~ /[[:digit:]]+[\r\n]$/;
+my $pid = $killme_stdout;
+chomp($pid);
+$killme_stdout = '';
+
+#insert a row that should *not* survive, due to in-progress xact
+$killme_stdin .= q[
+BEGIN;
+INSERT INTO alive VALUES($$in-progress-before-sigquit$$) RETURNING status;
+];
+$killme->pump until $killme_stdout =~ /in-progress-before-sigquit/;
+$killme_stdout = '';
+
+
+# Start long-running query in second session, its failure will signal
+# that crash-restart has occurred.
+$monitor_stdin .= q[
+SELECT pg_sleep(3600);
+];
+$monitor->pump;
+
+
+# kill once with QUIT - we expect psql to exit, while emitting error message first
+my $cnt = kill 'QUIT', $pid;
+
+# Exactly one process should have been alive to be killed
+is($cnt, 1, "exactly one process killed with SIGQUIT");
+
+# Check that psql sees the killed backend as having been terminated
+$killme_stdin .= q[
+SELECT 1;
+];

Re: [HACKERS] [PATCH] Generic type subscripting

2017-09-18 Thread Dmitry Dolgov
> On 18 September 2017 at 11:39, Arthur Zakirov 
wrote:
> I think it would be good to add new catalog table. It may be named as
pg_type_sbs or pg_subscripting (second is better I think).
> This table may have the fields:
> - oid
> - sbstype
> - sbsinit
> - sbsfetch
> - sbsassign

What is `sbstype`?

> And command 'CREATE SUBSCRIPTING' should add an entry to the
pg_subscripting table. It also adds entries to the pg_depend table:
dependency between pg_type and pg_subscripting, dependency between pg_type
and pg_proc.
> 'DROP SUBSCRIPTING' should drop this entries, it should not drop init
function.

Why should we keep those subscripting functions? From my understanding
they're totally internal and useless outside the subscripting context.

Overall I have only one concern about this suggestion - basically it changes
nothing from the perspective of functionality or implementation quality. The
only purpose of it is to make a `subscripting` entity more explicit, at the
expense of the overhead of having one more catalog table and a little more
complexity. I'm not really sure whether it's necessary, and I would
appreciate any comments on this topic from the community (to help convince
me whether or not this is a good decision).


Re: [HACKERS] src/test/subscription/t/002_types.pl hanging on particular environment

2017-09-18 Thread Andres Freund
Hi,

On 2017-09-18 21:57:04 +1200, Thomas Munro wrote:
> The subscription tests 002_types.pl sometimes hangs for a while and
> then times out when run on a Travis CI build VM running Ubuntu Trusty
> if --enable-coverage is used.

Yea, I saw that too.


> I guess it might be a timing/race
> problem because I can't think of any mechanism specific to coverage
> instrumentation except that it slows stuff down, but I don't know.
> The only other thing I can think of is that perhaps the instrumented
> code is writing something to stdout or stderr and that's finding its
> way into some protocol stream and confusing things, but I can't
> reproduce this on any of my usual development machines.  I haven't
> tried that exact operating system.  Maybe it's a bug in the toolchain,
> but that's an Ubuntu LTS release so I assume other people use it
> (though I don't see it in the buildfarm).

I've run this locally through a number of iterations with coverage
enabled. I think I reproduced it once, but unfortunately I continued
because I was working on something else at that moment.


It might be worthwhile to play around with replacing the or die in
my $synced_query =
"SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('s', 
'r');";
$node_subscriber->poll_query_until('postgres', $synced_query)
  or die "Timed out while waiting for subscriber to synchronize data";
with something like
or diag($node_subscriber->safe_psql('postgres', 'SELECT * FROM 
pg_subscription_rel'))
just to know where to go from here.


> WARNING:  terminating connection because of crash of another server 
> process
> DETAIL:  The postmaster has commanded this server process to roll
> back the current transaction and exit, because another server process
> exited abnormally and possibly corrupted shared memory.
> HINT:  In a moment you should be able to reconnect to the database
> and repeat your command.
> 
> As far as I know these are misleading, it's really just an immediate
> shutdown.  There is no core file, so I don't believe anything actually
> crashed.

I was about to complain about these, for entirely unrelated reasons. I
think it's a bad idea to emit these warnings - and there are a couple of
complaints about them on the lists too.  It's not entirely trivial to fix
though :(

Greetings,

Andres Freund




[HACKERS] src/test/subscription/t/002_types.pl hanging on particular environment

2017-09-18 Thread Thomas Munro
Hi,

The subscription tests 002_types.pl sometimes hangs for a while and
then times out when run on a Travis CI build VM running Ubuntu Trusty
if --enable-coverage is used.  I guess it might be a timing/race
problem because I can't think of any mechanism specific to coverage
instrumentation except that it slows stuff down, but I don't know.
The only other thing I can think of is that perhaps the instrumented
code is writing something to stdout or stderr and that's finding its
way into some protocol stream and confusing things, but I can't
reproduce this on any of my usual development machines.  I haven't
tried that exact operating system.  Maybe it's a bug in the toolchain,
but that's an Ubuntu LTS release so I assume other people use it
(though I don't see it in the buildfarm).

Example:

t/001_rep_changes.pl .. ok
t/002_types.pl  #
# Looks like your test exited with 29 before it could output anything.
t/002_types.pl  Dubious, test returned 29 (wstat 7424, 0x1d00)
Failed 3/3 subtests
t/003_constraints.pl .. ok
t/004_sync.pl . ok
t/005_encoding.pl . ok
t/007_ddl.pl .. ok
Test Summary Report
---
t/002_types.pl  (Wstat: 7424 Tests: 0 Failed: 0)
  Non-zero exit status: 29
  Parse errors: Bad plan.  You planned 3 tests but ran 0.

Before I figured out that --coverage was relevant, I wondered if the
recent commit 8edacab209957520423770851351ab4013cb0167 which landed
around the time I spotted this might have something to do with it, but
I tried reverting the code change in there and it didn't help.  Here's
a build log:

https://travis-ci.org/macdice/postgres/jobs/276752803

As you can see the script used was:

./configure --enable-debug --enable-cassert --enable-tap-tests
--enable-coverage && make -j4 all contrib && make -C
src/test/subscription check

In this build you can see the output of the following at the end,
which might provide clues to the initiated.  You might need to click a
small triangle to unfold the commands' output.

cat ./src/test/subscription/tmp_check/log/002_types_publisher.log
cat ./src/test/subscription/tmp_check/log/002_types_subscriber.log
cat ./src/test/subscription/tmp_check/log/regress_log_002_types

There are messages like this:

WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database
and repeat your command.

As far as I know these are misleading, it's really just an immediate
shutdown.  There is no core file, so I don't believe anything actually
crashed.

Here's a version that's the same except it doesn't have
--enable-coverage.  It passes:

https://travis-ci.org/macdice/postgres/jobs/276771654

Any ideas?

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Is it time to kill support for very old servers?

2017-09-18 Thread Andres Freund
On 2017-09-13 23:39:21 -0400, Tom Lane wrote:
> The real problem in this area, to my mind, is that we're not testing that
> code --- either end of it --- in any systematic way.  If it's broken it
> could take us quite a while to notice.

Independent of the thrust of my question - why aren't we adding a
'force-v2' option to libpq?  A test that basically does something like
postgres[22923][1]=# \setenv PGFORCEV2 1
postgres[22923][1]=# \c
You are now connected to database "postgres" as user "andres".
postgres[22924][1]=>
seems easy enough to add, in fact I tested the above.

And the protocol coverage of the v2 protocol seems small enough that a
single not too large file ought to cover most of it quite easily.
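
A tiny client like the one below could then check which protocol was actually
negotiated (a sketch only: PGFORCEV2 is the hypothetical switch discussed here,
nothing in libpq implements it yet):

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn;

    /* Assumption: libpq would honour this and start with protocol 2.0. */
    setenv("PGFORCEV2", "1", 1);

    conn = PQconnectdb("dbname=postgres");
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    /* PQprotocolVersion() reports 2 or 3 for the live connection. */
    printf("negotiated protocol: %d\n", PQprotocolVersion(conn));

    PQfinish(conn);
    return 0;
}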

Greetings,

Andres Freund




Re: [HACKERS] Automatic testing of patches in commit fest

2017-09-18 Thread Aleksander Alekseev
Hi Thomas,

> 1.  I didn't have --enable-cassert enabled before.  Oops.
> 2.  It'll now dump a gdb backtrace for any core files found after a
> check-world failure (if you can find your way to the build log...).
> Thanks to Andres for the GDB scripting for this!
> 3.  It'll now push gcov results to codecov.io -- see link at top of
> page.  Thanks again to Andres for this idea.
> 4.  It now builds a little bit faster due to -j4 (Travis CI VMs have 2
> virtual cores) and .proverc -j3.  (So far one entry now fails in TAP
> tests with that setting, will wait longer before drawing any
> conclusions about that.)

Wow. Well done!

> LLVM sanitizer stuff

In my experience it doesn't work well with PostgreSQL code, way too many
false positives. For more details see this discussion [1].

> Valgrind

That would be great. Please note that Valgrind generates false reports
regarding memory leaks in PostgreSQL so you should use --leak-check=no.
Also USE_VALGRIND should be defined in src/include/pg_config_manual.h
before building PostgreSQL. Here [2] is a script I'm using.

> Coverity

I believe it would not be easy to integrate with the web version of
Coverity. On the other hand, Clang Static Analyzer often finds real
defects and it's very easy to integrate with. Here is my script [3].

> more compilers, 

In my experience trying to compile a patch on FreeBSD with Clang often
helps to find some sort of defect. Same regarding trying different
architectures, e.g. x64, x86, arm.

[1]: https://www.postgresql.org/message-id/20160321130850.6ed6f598%40fujitsu
[2]: https://github.com/afiskon/pgscripts/blob/master/valgrind.sh
[3]: https://github.com/afiskon/pgscripts/blob/master/static-analysis.sh

-- 
Best regards,
Aleksander Alekseev




Re: [HACKERS] psql: new help related to variables are not too readable

2017-09-18 Thread Alvaro Herrera
Pavel Stehule wrote:

> I am looking at man pages - and they are very readable
> 
> The rules are simple - when some variables are short - less than 6 chars,
> then the description and label are on the same line. Between items there
> is an empty line

I was having a similar idea.  I'm not sure how many lines we would save
that way, but it seems to me we could use a slightly longer threshold --
say 8 chars before we cause a newline and wrap.

Maybe we need to come up with guidelines for translators, also.  As
somebody noted, I've always gone to some trouble to produce translated
output that preserves the original's constraints; I don't see any reason
for other translations to behave differently.

(By all means, if we add entry-separating newlines, let's not put them
as part of the translatable string.)

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Should we cacheline align PGXACT?

2017-09-18 Thread Daniel Gustafsson
> On 16 Sep 2017, at 01:51, Alexander Korotkov  
> wrote:
> 
> On Tue, Sep 5, 2017 at 2:47 PM, Daniel Gustafsson  > wrote:
> > On 04 Apr 2017, at 14:58, David Steele  > > wrote:
> >
> > On 4/4/17 8:55 AM, Alexander Korotkov wrote:
> >> On Mon, Apr 3, 2017 at 9:58 PM, Andres Freund  >> 
> >>
> >>I'm inclined to push this to the next CF, it seems we need a lot more
> >>benchmarking here.
> >>
> >> No objections.
> >
> > This submission has been moved to CF 2017-07.
> 
> This CF has now started (well, 201709 but that’s what was meant in above), can
> we reach closure on this patch in this CF?
> 
> During the previous commitfest I came to doubt whether this patch is really
> needed when the same effect could be achieved by another (perhaps random)
> change of alignment.  The thing I can do now is to retry my benchmark on
> current master and check what effect this patch has now.

Considering this I’ll mark this as Waiting on Author; in case you come to the
conclusion that another patch is required, then we’ll bump it to a returned status.

cheers ./daniel 



Re: [HACKERS] [PATCH] Generic type subscripting

2017-09-18 Thread Arthur Zakirov
On Mon, Sep 18, 2017 at 10:31:54AM +0200, Dmitry Dolgov wrote:
> Just to clarify, do you mean that `CREATE SUBSCRIPTING FOR` would only make
> a
> dependency record? In this case `DROP SUBSCRIPTING FOR` actually means just
> drop an init function.

I think it would be good to add a new catalog table. It may be named
pg_type_sbs or pg_subscripting (the second is better I think).
This table may have the fields:
- oid
- sbstype
- sbsinit
- sbsfetch
- sbsassign
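
In C terms the row layout would be roughly the following (a sketch only — the
OID is a placeholder and nothing like this header exists yet):

#include "postgres.h"
#include "catalog/genbki.h"

/* Hypothetical catalog header following the field list above. */
CATALOG(pg_subscripting,9999)           /* OID is a placeholder */
{
    Oid         sbstype;        /* type that owns this subscripting behaviour */
    regproc     sbsinit;        /* subscripting initialization function */
    regproc     sbsfetch;       /* fetch (read) function */
    regproc     sbsassign;      /* assignment (write) function */
} FormData_pg_subscripting;

typedef FormData_pg_subscripting *Form_pg_subscripting;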

The command 'CREATE SUBSCRIPTING' should add an entry to the pg_subscripting
table. It also adds entries to the pg_depend table: a dependency between pg_type
and pg_subscripting, and a dependency between pg_type and pg_proc.
'DROP SUBSCRIPTING' should drop these entries; it should not drop the init function.

According to Tom's comment the syntax can be modified in the following way:

CREATE SUBSCRIPTING FOR type_name (
  INITFUNC = subscripting_init_func
  FETCHFUNC = subscripting_fetch_func
  ASSIGNFUNC = subscripting_assign_func
)
DROP SUBSCRIPTING FOR type_name

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] GatherMerge misses to push target list

2017-09-18 Thread Amit Kapila
On Mon, Sep 18, 2017 at 12:46 PM, Rushabh Lathia
 wrote:
> Thanks Amit for the patch.
>
> I reviewed the code changes as well as performed more testing. Patch
> looks good to me.
>

In that case, can you please mark the patch [1] as ready for committer in CF app

> Here is the updated patch - where added test-case clean up.
>

oops, missed dropping the function.


Thanks for the review.


[1] - https://commitfest.postgresql.org/15/1293/

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Partition-wise aggregation/grouping

2017-09-18 Thread Rajkumar Raghuwanshi
On Mon, Sep 18, 2017 at 12:37 PM, Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

>
>
> On Tue, Sep 12, 2017 at 6:21 PM, Jeevan Chalke <
> jeevan.cha...@enterprisedb.com> wrote:
>
>>
>>
>> On Tue, Sep 12, 2017 at 3:24 PM, Rajkumar Raghuwanshi <
>> rajkumar.raghuwan...@enterprisedb.com> wrote:
>>
>>>
>>> Hi Jeevan,
>>>
>>> I have started testing partition-wise-aggregate and got one observation,
>>> please take a look.
>>> with the v2 patch, here if I change target list order, query is not
>>> picking full partition-wise-aggregate.
>>>
>>
>> Thanks Rajkumar for reporting this.
>>
>> I am looking into this issue and will post updated patch with the fix.
>>
>
> The logic for checking whether the partition keys lead the group by keys needs
> to be updated here. The group by expressions can appear in any order without
> affecting the final result. Thus, requiring the partition keys to lead the
> group by keys in order to have full aggregation is not mandatory. Instead we
> must ensure that the partition keys are part of the group by keys to compute
> full aggregation on a partition.
>
> Attached, revised patch-set with above fix.
>
> Also, in test-cases, I have removed DROP/ANALYZE commands on child
> relations and also removed VERBOSE from the EXPLAIN.
>
> Notes:
> HEAD: 8edacab209957520423770851351ab4013cb0167
> Partition-wise Join patch-set version: v32
>

Thanks for the patch. I have tested it and the issue is fixed now.


[HACKERS] global indices

2017-09-18 Thread Rafal Pietrak
Hi,

I've just joined the list, as discussion on the [GENERAL] pointed me to
a thread here, which I'm very interested in.

But I'm not quite at "hacker" level, so please forgive me if what I say
is naive for you here.

To the point. ... with a little introduction first:

On [GENERAL] I've been advised to change my schema from an inheritance
hierarchy to a flat table, as I've explained that I need an ID
that'll be unique across the entire hierarchy.

But this is impossible for me, because .



.the data I have there requires me to have two different indices (as
FK targets from elsewhere) over disjoint parts of the dataset:
CREATE TABLE tt (ID int, A int, B int, c bool)
-- rows are like (1,1,1,true)
CREATE TABLE t1 (primary key (id,a), check (c=true)) INHERITS (tt);
-- here (1,1,1,true) must raise error when (1,1,2,true) is present
CREATE TABLE t2 (primary key (id,b), check (c=false)) INHERITS (tt);
-- here (1,1,1,false) must raise error when (1,2,1,false) is present
-- but if this two rows were 'true' instead of 'false', they are
-- entirely acceptable.

With this, I understand that I fall into the usage scenario you've
identified as "FK into the master table global index". Consequently, you
are planning to put a "table object selector" into the "global index", so
that an FK pointing to the master table index can get redirected to the
relevant subtable containing the actual data.

I may be mistaken, as I haven't followed (or grepped the archives for) all your
discussion on this subject, but I haven't seen you discussing another
implementation - a functional index; something that won't be as universal
as your plan, but (IMHO) can be much easier to implement.


Assuming the TT/T1/T2 tables example from above (based on my actual
data):

1. I use "FUNCTION myindex(id,a,b,c)" to create unique indexes on all
subtables.

2. It would be very hard for me to work out a similar function that
returns !!Correctly unique!! values for every row in every subtable -
from the example above you can see that "unique" by just concatenating
the fields is not correct in this use case.

3. but (as per -1- above) it's doable and easy to make such a function for
every subtable. It just concatenates some columns, and ignores others,
based on the "selector column".

So I would like to propose a mechanism for a "global index", using
functional indexes as follows:

A) have a FUNCTION, with an argument list covering the columns that are used
in the CHECK constraints partitioning the table.

B) allow for that function to be used throughout the entire hierarchy of
inheritance.

C) using values returned by that function, create actual indexes only on
the LEAF subtables.

D) "register" that function - it's argument list in particular - at
every lever of inheritance hierarchy.

E) at every level of the inheritance hierarchy, to select the appropriate
subtable, use only the CHECK constraints against the relevant column being
an argument to the function registered as a functional index there.

This way the FK subtable selection is *identical* to the selection done
on INSERT/UPDATE/DELETE/SELECT actions; we don't have any additional
disk space consumed (needing to vacuum) by the "global index"; we don't
have to keep the "table object" within the hierarchy indexes; and in
consequence, dropping a subtable from the hierarchy is as cheap as today.




In short, this proposal is:

1. creation of a functional index on the parent table of an inheritance tree
does not actually create any (material) index

2. instead, it installs "processing hooks" and "FK virtual targets",
so that:
2.1 it is possible to create an FK targeting it.
2.2 processing such a dereference (resolution of the actual subtable and
row) is based on the CHECK constraints of the subtables present.


-R




Re: [HACKERS] Improving DISTINCT with LooseScan node

2017-09-18 Thread Dmitriy Sarafannikov

> It seems related to this thread? :
> https://www.postgresql.org/message-id/flat/5037A9C5.4030701%40optionshouse.com#5037a9c5.4030...@optionshouse.com
> 
> And this wiki page : https://wiki.postgresql.org/wiki/Loose_indexscan

Yep. Now I can see 2 use cases for this feature:
1. DISTINCT queries.
2. Efficiently scanning a multicolumn index when the first column is omitted
and has low cardinality.


> Not an answer to your question, but generally +1 for working on this
> area.  I did some early proto-hacking a while ago, which I haven't had
> time to get back to yet:
> 
> https://www.postgresql.org/message-id/flat/CADLWmXWALK8NPZqdnRQiPnrzAnic7NxYKynrkzO_vxYr8enWww%40mail.gmail.com
>  
> 
> 
> That was based on the idea that a DISTINCT scan using a btree index to
> skip ahead is always going to be using the leading N columns and
> always going to be covered by the index, so I might as well teach
> Index Only Scan how to do it directly rather than making a new node to
> sit on top.  As you can see in that thread I did start thinking about
> using a new node to sit on top and behave a bit like a nested loop for
> the more advanced feature of an Index Skip Scan (trying every value of
> (a) where you had an index on (a, b, c) but had a WHERE clause qual on
> (b, c)), but the only feedback I had so far was from Robert Haas who
> thought that too should probably be pushed into the index scan.

Thank you for the information, I will look at this thread.

> FWIW I'd vote for 'skip' rather than 'loose' as a term to use in this
> family of related features (DISTINCT being the easiest to build).  It
> seems more descriptive than the MySQL term.  (DB2 added this a little
> while ago and went with 'index jump scan', suggesting that we should
> consider 'hop'... (weak humour, 'a hop, skip and a jump' being an
> English idiom meaning a short distance)).

Maybe skip would be better, but will there be any problems with something like
patents? I mean the database whose name begins with the capital letter «O»?
As far as I know, it has Index Skip Scan.




[HACKERS] Reporting query on crash even if completed

2017-09-18 Thread Andres Freund
Hi,

When a backend dies, in a manner triggering a crash restart, we
currently log something like:

LOG: %s (PID %d) was terminated by signal %d
LOG: %s (PID %d) exited with exit code %d
DETAIL: Failed process was running: %s

Notably we log the query from the backend whenever
pgstat_get_crashed_backend_activity() can find a query string.

That used to be only when there's an active query, but since
commit 4f42b546fd87a80be30c53a0f2c897acb826ad52
Author: Magnus Hagander 
Date:   2012-01-19 14:19:20 +0100

Separate state from query string in pg_stat_activity

This separates the state (running/idle/idleintransaction etc) into
it's own field ("state"), and leaves the query field containing just
query text.

The query text will now mean "current query" when a query is running
and "last query" in other states. Accordingly,the field has been
renamed from current_query to query.

Since backwards compatibility was broken anyway to make that, the procpid
field has also been renamed to pid - along with the same field in
pg_stat_replication for consistency.

Scott Mead and Magnus Hagander, review work from Greg Smith

that's not the case anymore. I can't recall anybody complaining, but to
me it seems fairly confusing to report that some query was running when
it's actually not.

Magnus, was that intentional?  Others, does anybody think we want it
this way?


Manual signalling and the OOM killer are probably the biggest sources
of backends being killed even when there are no queries currently
running.

Greetings,

Andres Freund




Re: [HACKERS] [PATCH] Generic type subscripting

2017-09-18 Thread Dmitry Dolgov
> On 17 September 2017 at 23:34, Arthur Zakirov 
wrote:
>
> I have put some thought into it. What about the following syntax?
>
> CREATE SUBSCRIPTING FOR type_name
>   INITFUNC = subscripting_init_func
>   FETCHFUNC = subscripting_fetch_func
>   ASSIGNFUNC = subscripting_assign_func
> DROP SUBSCRIPTING FOR type_name

Just to clarify, do you mean that `CREATE SUBSCRIPTING FOR` would only make
a
dependency record? In this case `DROP SUBSCRIPTING FOR` actually means just
drop an init function.


Re: [HACKERS] SCRAM in the PG 10 release notes

2017-09-18 Thread Noah Misch
On Thu, Sep 14, 2017 at 09:57:36AM +0300, Heikki Linnakangas wrote:
> On 09/12/2017 04:09 AM, Noah Misch wrote:
> >On Wed, May 10, 2017 at 10:50:51PM -0400, Bruce Momjian wrote:
> >>On Mon, May  1, 2017 at 08:12:51AM -0400, Robert Haas wrote:
> >>>On Tue, Apr 25, 2017 at 10:16 PM, Bruce Momjian  wrote:
> Well, we could add "MD5 users are encouraged to switch to
> SCRAM-SHA-256".  Now whether we want to list this as something on the
> SCRAM-SHA-256 description, or mention it as an incompatibility, or
> under Migration.  I am not clear that MD5 is in such terrible shape that
> this is warranted.
> >>>
> >>>I think it's warranted.  The continuing use of MD5 has been a headache
> >>>for some EnterpriseDB customers who have compliance requirements which
> >>>they must meet.  It's not that they themselves necessarily know or
> >>>care whether MD5 is secure, although in some cases they do; it's that
> >>>if they use it, they will be breaking laws or regulations to which
> >>>their business or agency is subject.  I imagine customers of other
> >>>PostgreSQL companies have similar issues.  But leaving that aside, the
> >>>advantage of SCRAM isn't merely that it uses a better algorithm to
> >>>hash the password.  It has other advantages also, like not being
> >>>vulnerable to replay attacks.  If you're doing password
> >>>authentication, you should really be using SCRAM, and encouraging
> >>>people to move to SCRAM after upgrading is a good idea.
> >>>
> >>>That having been said, SCRAM is a wire protocol break.  You will not
> >>>be able to upgrade to SCRAM unless and until the drivers you use to
> >>>connect to the database add support for it.  The only such driver
> >>>that's part of libpq; other drivers that have reimplemented the
> >>>PostgreSQL wire protocol will have to be updated with SCRAM support
> >>>before it will be possible to use SCRAM with those drivers.  I think
> >>>this should be mentioned in the release notes, too.  I also think it
> >>>would be great if somebody would put together a wiki page listing all
> >>>the popular drivers and (1) whether they use libpq or reimplement the
> >>>wire protocol, and (2) if the latter, the status of any efforts to
> >>>implement SCRAM, and (3) if those efforts have been completed, the
> >>>version from which they support SCRAM.  Then, I think we should reach
> >>>out to all of the maintainers of those driver authors who aren't
> >>>moving to support SCRAM and encourage them to do so.
> >>
> >>I have added this as an open item because we will have to wait to see
> >>where we are with driver support as the release gets closer.
> >
> >With the release near, I'm promoting this to the regular open issues section.
> 
> Thanks.
> 
> I updated the list of drivers on the wiki
> (https://wiki.postgresql.org/wiki/List_of_drivers), adding a column for
> whether the driver supports SCRAM authentication. Currently, the only
> non-libpq driver that has implemented SCRAM is the JDBC driver. I submitted
> a patch for the Go driver, but it hasn't been committed yet.
> 
> As for a recommendation in the release notes, maybe something like
> "Installations using MD5 authentication are encouraged to switch to
> SCRAM-SHA-256, unless using older client programs or drivers that don't
> support it yet."

That sounds reasonable.

This PostgreSQL 10 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com



