Re: [HACKERS] OK, so culicidae is *still* broken

2017-04-14 Thread Andres Freund


On April 14, 2017 9:42:41 PM PDT, Tom Lane wrote:
>Per
>https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-04-15%2004%3A00%3A02
>
>2017-04-15 04:31:21.657 GMT [16792] FATAL:  could not reattach to
>shared memory (key=6280001, addr=0x7f692fece000): Invalid argument
>
>Presumably, this is the same issue we've seen on Windows where the
>shmem address range gets overlapped by code loaded at a randomized
>address.  Is there any real hope of making that work?

Seems to work reasonably regularly on other branches... On phone only, so can't 
dig into details, but it seems there's some chance involved.  Let's see what 
the next few runs will do.  Will crank frequency once home.

Andres

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] OK, so culicidae is *still* broken

2017-04-14 Thread Tom Lane
Per
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-04-15%2004%3A00%3A02

2017-04-15 04:31:21.657 GMT [16792] FATAL:  could not reattach to shared memory 
(key=6280001, addr=0x7f692fece000): Invalid argument

Presumably, this is the same issue we've seen on Windows where the
shmem address range gets overlapped by code loaded at a randomized
address.  Is there any real hope of making that work?

regards, tom lane




Re: [HACKERS] PANIC in pg_commit_ts slru after crashes

2017-04-14 Thread Pavan Deolasee
On Sat, Apr 15, 2017 at 12:53 AM, Jeff Janes wrote:

>
> In the first statement executed after crash recovery, I sometimes get this
> error:
>
> PANIC:  XX000: could not access status of transaction 207580505
> DETAIL:  Could not read from file "pg_commit_ts/1EF0" at offset 131072:
> Success.
> LOCATION:  SlruReportIOError, slru.c:918
> STATEMENT:  create temporary table aldjf (x serial)
>
> Other examples:
>
> PANIC:  XX000: could not access status of transaction 3483853232
> DETAIL:  Could not read from file "pg_commit_ts/20742" at offset 237568:
> Success.
> LOCATION:  SlruReportIOError, slru.c:918
> STATEMENT:  create temporary table aldjf (x serial)
>
> PANIC:  XX000: could not access status of transaction 802552883
> DETAIL:  Could not read from file "pg_commit_ts/779E" at offset 114688:
> Success.
> LOCATION:  SlruReportIOError, slru.c:918
> STATEMENT:  create temporary table aldjf (x serial)
>
> Based on the errno, I'm assuming the read was successful but returned the
> wrong number of bytes (which was zero in the case I saw after changing the
> code to log short reads).
>
> It then goes through recovery again and the problem does not immediately
> re-occur if you attempt to connect again.  I don't know why the file size
> would have changed between attempts.
>
>
> The problem bisects to the commit:
>
> commit 728bd991c3c4389fb39c45dcb0fe57e4a1dccd71
> Author: Simon Riggs 
> Date:   Tue Apr 4 15:56:56 2017 -0400
>
> Speedup 2PC recovery by skipping two phase state files in normal path
>
>
> It is not obvious to me how that is relevant.  My test doesn't use
> prepared transactions (and leaves the GUC at zero), and this commit doesn't
> touch slru.c.
>
> I'm attaching the test harness.  There is a patch which injects the
> crash-faults and also allows xid fast-forward, a perl script that runs
> until crash and assesses the consistency of the post-crash results, and a
> shell script which sets up the database and then calls the perl script in a
> loop.  On 8 CPU machine, it takes about an hour for the PANIC to occur.
>
> The attached script bails out once it sees the PANIC (count.pl line 158);
> if it didn't do that, it would proceed to connect again, retry, and work
> fine.  No apparent permanent data corruption.
>
> Any clues on how to investigate this further?
>

Since all those offsets fall on a page boundary, my guess is that we're
somehow failing to handle a new page correctly.

Looking at the patch itself, my feeling is that the following code
in src/backend/access/transam/twophase.c might be causing the problem.

    /* update nextXid if needed */
    if (TransactionIdFollowsOrEquals(maxsubxid, ShmemVariableCache->nextXid))
    {
        LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
        ShmemVariableCache->nextXid = maxsubxid;
        TransactionIdAdvance(ShmemVariableCache->nextXid);
        LWLockRelease(XidGenLock);
    }

The function PrescanPreparedTransactions() gets called at the start of
redo recovery, and this specific block gets exercised irrespective of
whether there are any prepared transactions. What I find particularly
wrong here is that we initialise maxsubxid to the current value of
ShmemVariableCache->nextXid when the function enters, but this block then
increments ShmemVariableCache->nextXid again even when there are no
prepared transactions in the system.

I wonder if we should do as in the attached patch.

It's not entirely clear to me why only CommitTS fails and not the other
SLRUs. One possible reason could be that CommitTS is started before this
function gets called, whereas CLOG and SUBTRANS are started after it
returns.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


fix_commit_ts_crash.patch
Description: Binary data



Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-14 Thread Tom Lane
Andreas Karlsson writes:
> Looked some at this, and what takes time now for me seems mainly to be
> these four things (out of a total runtime of 560 ms).

> 1. setup_conversion:140 ms
> 2. select_default_timezone:  90 ms
> 3. bootstrap_template1:  80 ms
> 4. setup_schema: 65 ms

FWIW, you can bypass select_default_timezone by setting environment
variable TZ to a valid timezone name before calling initdb.  On
my workstation, that currently cuts the time for "initdb --no-sync"
from about 1.25 sec to just about exactly 1.0 sec.

I doubt that we want all the buildfarm members to switch over to
doing that, since then we'd lose coverage of select_default_timezone.
But it's interesting to speculate about having the buildfarm script
extract the selected timezone name out of postgresql.conf after the
first initdb it does, and then set TZ for remaining tests.  We surely
don't need to test select_default_timezone more than once per
buildfarm run.

regards, tom lane




Re: [HACKERS] [pgsql-www] Small issue in online devel documentation build

2017-04-14 Thread Noah Misch
On Fri, Apr 14, 2017 at 09:38:32PM +0200, Magnus Hagander wrote:
> On Fri, Apr 14, 2017 at 9:36 PM, Peter Eisentraut wrote:
> > On 4/14/17 14:45, Magnus Hagander wrote:
> > > Attached is a patch that can be applied to pgweb which should fix all 
> > > of
> > > this.

> Indeed. Fix pushed.

I see  formats as italic, instead of the bold-italic seen in the
9.6 docs.  (This is visible in the CREATE TABLE page, linked upthread.)  Is it
worth restoring the former formatting thereof?




Re: [HACKERS] logical replication launcher crash on buildfarm

2017-04-14 Thread Tom Lane
Petr Jelinek writes:
> So this is what I came up with on the plane. Generalized the loading
> functionality into load_library_function, which can load either
> known postgres functions or call load_external_function.

I've had more than enough of seeing buildfarm failures from culicidae,
so I whacked this around until I was happy with it and pushed it.
Further adjustments welcome of course.

> I am not quite sure if fmgr.c is best place to put it, but I didn't want
> to include stuff from executor in bgworker.c.

I really did not care for putting those references into fmgr.c.  Aside
from the layering violations involved, there was a type cheat, in that
you had functions with entirely different signatures in the same list.
I eventually decided that the least ugly solution was to create two
different pointer arrays and frontend functions, one for bgworkers and
one for parallel workers.  That ends up requiring only one extra
export/import, to make ParallelQueryMain accessible in parallel.c.
Maybe we need to rethink the division of labor between parallel.c
and execParallel.c, but that would depend on somebody explaining
what the difference is in the first place.

> As with previous patch, 9.6 will need quite different patch as we need
> to keep compatibility there,

I don't really understand why 9.6 needs a significantly different
patch.  AFAICS, it simply does not work (with any reliability)
for a loadable module to call CreateParallelContext directly in 9.6.
The only thing that actually works is to call the undocumented
CreateParallelContextForExternalFunction.  I suggest that we could
back-patch this as-is, if we add CreateParallelContextForExternalFunction
as a trivial wrapper around CreateParallelContext.

regards, tom lane




Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-14 Thread Noah Misch
On Wed, Apr 05, 2017 at 09:51:02PM -0400, Noah Misch wrote:
> On Thu, Apr 06, 2017 at 12:48:56AM +0900, Fujii Masao wrote:

> > > On Mon, Dec 19, 2016 at 09:49:58PM +0900, Fujii Masao wrote:
> > >> (2)
> > >> There will be still many source comments and documentations that
> > >> we need to update, for example, in high-availability.sgml. We need to
> > >> check and update them thoroughly.
> > >>
> > >> (3)
> > >> The priority value is assigned to each standby listed in s_s_names
> > >> even in quorum commit though those priority values are not used at all.
> > >> Users can see those priority values in pg_stat_replication.
> > >> Isn't this confusing? If yes, it might be better to always assign 1 as
> > >> the priority, for example.

> > Regarding the item (2), Sawada-san told me that he will work on it after
> > this CommitFest finishes. So we would receive the patch for the item from
> > him next week. If there will be no patch even after the end of next week
> > (i.e., April 14th), I will. Let's wait for Sawada-san's action at first.
> 
> Sounds reasonable; I will look for your update on 14Apr or earlier.

This PostgreSQL 10 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

> Since you do want (3) to change, please own it like any other open item,
> including the mandatory status updates.

Likewise.




Re: [HACKERS] snapbuild woes

2017-04-14 Thread Petr Jelinek
Hi, here is updated patch (details inline).

On 13/04/17 20:00, Petr Jelinek wrote:
> Thanks for looking at this!
> 
> On 13/04/17 02:29, Andres Freund wrote:
>> Hi,
>> On 2017-03-03 01:30:11 +0100, Petr Jelinek wrote:
>>
>>> From 7d5b48c8cb80e7c867b2096c999d08feda50b197 Mon Sep 17 00:00:00 2001
>>> From: Petr Jelinek 
>>> Date: Fri, 24 Feb 2017 21:39:03 +0100
>>> Subject: [PATCH 1/5] Reserve global xmin for create slot snasphot export
>>>
>>> Otherwise the VACUUM or pruning might remove tuples still needed by the
>>> exported snapshot.
>>> ---
>>>  src/backend/replication/logical/logical.c | 21 -
>>>  1 file changed, 20 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/backend/replication/logical/logical.c 
>>> b/src/backend/replication/logical/logical.c
>>> index 5529ac8..57c392c 100644
>>> --- a/src/backend/replication/logical/logical.c
>>> +++ b/src/backend/replication/logical/logical.c
>>> @@ -267,12 +267,18 @@ CreateInitDecodingContext(char *plugin,
>>>  * the slot machinery about the new limit. Once that's done the
>>>  * ProcArrayLock can be released as the slot machinery now is
>>>  * protecting against vacuum.
>>> +*
>>> +* Note that we only store the global xmin temporarily in the in-memory
>>> +* state so that the initial snapshot can be exported. After initial
>>> +* snapshot is done global xmin should be reset and not tracked anymore
>>> +* so we are fine with losing the global xmin after crash.
>>>  * 
>>>  */
>>> LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
>>>  
>>> slot->effective_catalog_xmin = GetOldestSafeDecodingTransactionId();
>>> slot->data.catalog_xmin = slot->effective_catalog_xmin;
>>> +   slot->effective_xmin = slot->effective_catalog_xmin;
>>
>>
>>>  void
>>>  FreeDecodingContext(LogicalDecodingContext *ctx)
>>>  {
>>> +   ReplicationSlot *slot = MyReplicationSlot;
>>> +
>>> if (ctx->callbacks.shutdown_cb != NULL)
>>> shutdown_cb_wrapper(ctx);
>>>  
>>> +   /*
>>> +* Cleanup global xmin for the slot that we may have set in
>>> +* CreateInitDecodingContext().
>>
>> Hm.  Is that actually a meaningful point to do so? For one, it gets
>> called by pg_logical_slot_get_changes_guts(), but more importantly, the
>> snapshot is exported till SnapBuildClearExportedSnapshot(), which is the
>> next command?  If we rely on the snapshot magic done by ExportSnapshot()
>> it'd be worthwhile to mention that...
>>
> 
> (Didn't see the patch for a couple of months, so I don't remember all the
> detailed decisions anymore)
> 
> Yes, we rely on the backend's xmin being set for the exported snapshot. We
> only care about the global xmin for exported snapshots really; I assumed
> that's clear enough from "so that the initial snapshot can be exported",
> but I guess there should be a clearer comment about this where we actually
> clean it up.
> 

Okay, wrote new comment there, how is it now?

>>
>>> We do not take ProcArrayLock or similar
>>> +* since we only reset xmin here and there's not much harm done by a
>>> +* concurrent computation missing that.
>>> +*/
>>
>> Hum.  I was prepared to complain about this, but ISTM, that there's
>> absolutely no risk because the following
>> ReplicationSlotsComputeRequiredXmin(false); actually does all the
>> necessary locking?  But still, I don't see much point in the
>> optimization.
>>
> 
> Well, if we don't need it in LogicalConfirmReceivedLocation, I don't see
> why we need it here. Please enlighten me.
> 

I kept this as it was. After rereading: ReplicationSlotsComputeRequiredXmin(false)
takes a shared lock, while if we wanted to avoid the mutex and do the xmin
update under the lock we'd need an exclusive lock, so I think the
optimization is worth it...

>>
>>
>>> This patch changes the code so that stored snapshots are only used for
>>> logical decoding restart but not for initial slot snapshot.
>>
>> Yea, that's a very good point...
>>
>>> @@ -1284,13 +1286,13 @@ SnapBuildFindSnapshot(SnapBuild *builder, 
>>> XLogRecPtr lsn, xl_running_xacts *runn
>>>  
>>> return false;
>>> }
>>> -   /* c) valid on disk state */
>>> -   else if (SnapBuildRestore(builder, lsn))
>>> +   /* c) valid on disk state and not exported snapshot */
>>> +   else if (!TransactionIdIsNormal(builder->initial_xmin_horizon) &&
>>> +SnapBuildRestore(builder, lsn))
>>
>> Hm. Is this a good signaling mechanism? It'll also trigger for the SQL
>> interface, where it'd strictly speaking not be required, right?
>>
> 
> Good point. Maybe we should really tell snapshot builder if the snapshot
> is going to be exported or not explicitly (see the rant all the way down).
> 

I added the new signaling mechanism (a new boolean option indicating
whether we are building a full snapshot, which is only set when the snapshot
is exported or used by the transaction).

>>
>>> From 3318a929e691870f3c1ca665bec3bfa8ea2af2a8 Mon Sep 17 00:00:00 2001
>>> From: Petr 

[HACKERS] Logical replication - TRAP: FailedAssertion in pgstat.c

2017-04-14 Thread Erik Rijkers


Testing logical replication, with the following patches on top of 
yesterday's master:


0001-Reserve-global-xmin-for-create-slot-snasphot-export.patch +
0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch+
0003-Prevent-snapshot-builder-xmin-from-going-backwards.patch  +
0004-Fix-xl_running_xacts-usage-in-snapshot-builder.patch  +
0005-Skip-unnecessary-snapshot-builds.patch

Is applying that patch set still correct?

It builds fine, but when I run the old pgbench-over-logical-replication 
test I get:


TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File: 
"pgstat.c", Line: 828)


reliably (often within a minute).


The test itself does not fail, at least not that I saw (but I only ran a 
few).



thanks,


Erik Rijkers






Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-14 Thread Gavin Flower

On 15/04/17 13:44, Andreas Karlsson wrote:

On 04/14/2017 11:54 PM, Tom Lane wrote:

I failed to resist the temptation to poke at this, and found that
indeed nothing seems to break if we just use one transaction for the
whole processing of postgres.bki.  So I've pushed a patch that does
that.  We're definitely down to the point where worrying about the
speed of bootstrap mode, per se, is useless; the other steps in
initdb visibly take a lot more time.


Looked some at this, and what takes time now for me seems mainly to be 
these four things (out of a total runtime of 560 ms).


1. setup_conversion:140 ms
2. select_default_timezone:  90 ms
3. bootstrap_template1:  80 ms
4. setup_schema: 65 ms

These four take up about two thirds of the total runtime, so it seems 
likely that we may still have relatively low hanging fruit (but not 
worth committing for PostgreSQL 10).


I have not done profiling of these functions yet, so am not sure how 
they best would be fixed but maybe setup_conversion could be converted 
into bki entries to speed it up.


Andreas



How much could be done concurrently?


Cheers.
Gavin





Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-14 Thread Andreas Karlsson

On 04/14/2017 11:54 PM, Tom Lane wrote:

I failed to resist the temptation to poke at this, and found that
indeed nothing seems to break if we just use one transaction for the
whole processing of postgres.bki.  So I've pushed a patch that does
that.  We're definitely down to the point where worrying about the
speed of bootstrap mode, per se, is useless; the other steps in
initdb visibly take a lot more time.


Looked some at this, and what takes time now for me seems mainly to be 
these four things (out of a total runtime of 560 ms).


1. setup_conversion:140 ms
2. select_default_timezone:  90 ms
3. bootstrap_template1:  80 ms
4. setup_schema: 65 ms

These four take up about two thirds of the total runtime, so it seems 
likely that we may still have relatively low hanging fruit (but not 
worth committing for PostgreSQL 10).


I have not done profiling of these functions yet, so am not sure how 
they best would be fixed but maybe setup_conversion could be converted 
into bki entries to speed it up.


Andreas




Re: [HACKERS] Different table schema in logical replication crashes

2017-04-14 Thread Petr Jelinek
On 14/04/17 17:33, Peter Eisentraut wrote:
> On 4/14/17 08:49, Petr Jelinek wrote:
>>> Are we prepared to support different schemas in v10? Or should we
>>> disallow it for v10 and add a TODO?
>>>
>>
>> Ah nuts, yes, it's supposed to be supported; we seem not to initialize
>> cstate->range_table in tablesync, which causes this bug. The CopyState
>> struct is private to copy.c, so we can't easily set cstate->range_table
>> externally. I wonder if tablesync should just construct a CopyStmt instead
>> of calling the lower-level API.
> 
> Maybe pass the range_table to BeginCopyFrom so that it can write it into
> cstate?
> 

I tried something a bit different which seems cleaner to me: use the
pstate->p_rtable instead of an ad hoc, locally made-up range table, and fill
it using the standard addRangeTableEntryForRelation, both in tablesync and
in DoCopy, instead of the old coding.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
From 359c10959f639c1dd3b01f90d459ee4f03ccbd0b Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Sat, 15 Apr 2017 03:27:32 +0200
Subject: [PATCH] Set range table for CopyFrom in tablesync

This changes the mechanism of how range table is passed to CopyFrom
executor state. We used to generate the range table and one entry for
the relation manually inside DoCopy. Now we use
addRangeTableEntryForRelation to setup the range table and relation
entry for the ParseState which is then passed down by BeginCopyFrom.
---
 src/backend/commands/copy.c | 17 +++--
 src/backend/replication/logical/tablesync.c | 13 +++--
 2 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index b5af2be..a786d7b 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -34,6 +34,7 @@
 #include "libpq/pqformat.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "parser/parse_relation.h"
 #include "optimizer/clauses.h"
 #include "optimizer/planner.h"
 #include "nodes/makefuncs.h"
@@ -787,7 +788,6 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
Relationrel;
Oid relid;
RawStmt*query = NULL;
-   List   *range_table = NIL;
 
/* Disallow COPY to/from file or program except to superusers. */
if (!pipe && !superuser())
@@ -809,7 +809,6 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
if (stmt->relation)
{
TupleDesc   tupDesc;
-   AclMode required_access = (is_from ? ACL_INSERT : 
ACL_SELECT);
List   *attnums;
ListCell   *cur;
RangeTblEntry *rte;
@@ -822,12 +821,8 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
 
relid = RelationGetRelid(rel);
 
-   rte = makeNode(RangeTblEntry);
-   rte->rtekind = RTE_RELATION;
-   rte->relid = RelationGetRelid(rel);
-   rte->relkind = rel->rd_rel->relkind;
-   rte->requiredPerms = required_access;
-   range_table = list_make1(rte);
+   rte = addRangeTableEntryForRelation(pstate, rel, NULL, false, 
false);
+   rte->requiredPerms = (stmt->is_from ? ACL_INSERT : ACL_SELECT);
 
tupDesc = RelationGetDescr(rel);
attnums = CopyGetAttnums(tupDesc, rel, stmt->attlist);
@@ -841,7 +836,7 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
else
rte->selectedCols = 
bms_add_member(rte->selectedCols, attno);
}
-   ExecCheckRTPerms(range_table, true);
+   ExecCheckRTPerms(pstate->p_rtable, true);
 
/*
 * Permission check for row security policies.
@@ -977,7 +972,6 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
 
cstate = BeginCopyFrom(pstate, rel, stmt->filename, 
stmt->is_program,
   NULL, stmt->attlist, 
stmt->options);
-   cstate->range_table = range_table;
*processed = CopyFrom(cstate);  /* copy from file to database */
EndCopyFrom(cstate);
}
@@ -2921,6 +2915,9 @@ BeginCopyFrom(ParseState *pstate,
cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
cstate->raw_buf_index = cstate->raw_buf_len = 0;
 
+   /* Assign range table, we'll need it in CopyFrom. */
+   cstate->range_table = pstate->p_rtable;
+
tupDesc = RelationGetDescr(cstate->rel);
attr = tupDesc->attrs;
num_phys_attrs = tupDesc->natts;
diff --git a/src/backend/replication/logical/tablesync.c 
b/src/backend/replication/logical/tablesync.c
index d1f2734..9ba2c01 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -93,6 +93,8 @@
 
 #include 

Re: [HACKERS] Allowing extended stats on foreign and partitioned tables

2017-04-14 Thread Peter Eisentraut
On 4/10/17 06:18, Ashutosh Bapat wrote:
> This isn't exactly about this particular thread. But I noticed, that
> after we introduced RELKIND_PARTITIONED_TABLE, we required to change a
> number of conditions to include this relkind. We missed some places in
> initial commits and fixed those later. I am wondering whether we
> should creates macros clubbing relevant relkinds together based on the
> purpose of the tests e.g. IS_RELKIND_HAS_STORAGE(). When a new relkind
> is added, one can examine these macros to check whether the new
> relkind fits in the given macro. If all those macros are placed
> together, there is a high chance that we will not miss any place in
> the initial commit itself.

I think this is worth a try.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Shouldn't duplicate addition to publication be a no-op?

2017-04-14 Thread Peter Eisentraut
On 4/13/17 06:23, Amit Langote wrote:
> create table bar (a int);
> create publication mypub for table bar;
> alter publication mypub add table bar;
> ERROR:  relation "bar" is already member of publication "mypub"
> 
> 2nd command should be a no-op, IMHO.

We generally require an IF NOT EXISTS in those situations.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Row Level Security UPDATE Confusion

2017-04-14 Thread Rod Taylor
On Fri, Apr 14, 2017 at 9:09 AM, Stephen Frost wrote:

> Rod,
>
> * Rod Taylor (rod.tay...@gmail.com) wrote:
> > My actual use-case involves a range. Most users can see and manipulate the
> > record when CURRENT_TIMESTAMP is within active_period. Some users
> > (staff/account admins) can see recently dead records too. And a 3rd group
> > (senior staff) have no time restriction, though there are a few customers
> > they cannot see due to their information being a touch more sensitive.
> > I've simplified the below rules to just deal with active_period and the
> > majority of user view (@> CURRENT_TIMESTAMP).
>
> Interesting.
>
> > NOTE: the active_period range is '[)' by default, so records with upper() =
> > CURRENT_TIMESTAMP are not visible with the @> CURRENT_TIMESTAMP restriction.
>
> Is that really what you intend/want though?  For records with
> upper() = CURRENT_TIMESTAMP to not be visible?  You are able to change
> the range returned from tstzrange by specifying what you want, eg:
>

Yeah, think of it like a delete. Once a record is deleted you want it to
disappear. From the point of view of a user without time-travel privileges,
an UPDATE setting upper() = CURRENT_TIMESTAMP should make the row disappear
from any actions that take place later in the transaction.

A more common way of implementing this is an archive table: have a DELETE
trigger and shuffle the record to another storage area. But with many
dependent tuples linked via foreign keys this can be very time-consuming.
Flipping a time period is consistently fast, with the caveat that SELECT
pays a price.

If you decide Pg shouldn't allow a user to make a tuple disappear, I would
probably make a DO INSTEAD SECURITY DEFINER function that triggers on
DELETE for that role only and changes the time range. In reality, after
about a week for customers to contact their account administrator and say
"I accidentally deleted X", it would likely be moved to an archive structure.


> select tstzrange(current_timestamp, current_timestamp, '[]');
>
> > CREATE TABLE t (id integer, active_period tstzrange NOT NULL DEFAULT
> > tstzrange(current_timestamp, NULL));
>
> Why NULL instead of 'infinity'...?
>

Disk space. NULL works (almost) the same as infinity, but the storage is
quite a bit smaller.



>
> > -- Disallowed due to hide_old_select policy.
> > UPDATE t SET active_period = tstzrange(lower(active_period),
> > CURRENT_TIMESTAMP);
>
> Guess I'm still trying to figure out if you really intend for this to
> make the records invisible to the 'most users' case.
>

Yep. It's equivalent to a DELETE or DEACTIVATE. RLS may not be the right
facility but it was very close to working exactly the way I wanted in FOR
ALL mode.

-- 
Rod Taylor


Re: [HACKERS] minor typo in client-auth.sgml

2017-04-14 Thread Peter Eisentraut
On 4/14/17 18:26, Jaime Casanova wrote:
> I was reading about SCRAM and the changes in the docs and found that
> in this sentence (in doc/src/sgml/client-auth.sgml:925) there is a
> missing comma (,) when listing the "password-based authentication
> methods":
> 
> """
> The password-based authentication methods are scram
> md5 and password.
> """
> 

fixed

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-04-14 Thread Tom Lane
Peter Eisentraut writes:
> If we're talking about making things easier to understand, wouldn't a
> random user rather know what a WAL "location" is instead of a WAL "LSN"?

I wouldn't object to standardizing on "location" instead of "lsn" in the
related function and column names.  What I don't like is using different
words for the same thing.

regards, tom lane




Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-04-14 Thread Peter Eisentraut
On 4/14/17 11:36, Bruce Momjian wrote:
> Yeah, this area is complex enough so any consistency we can add helps.

If we're talking about making things easier to understand, wouldn't a
random user rather know what a WAL "location" is instead of a WAL "LSN"?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-04-14 Thread Peter Eisentraut
On 4/14/17 04:28, Kyotaro HORIGUCHI wrote:
> =# select distinct attname from pg_attribute where attname like '%lsn%';
>attname   
> -
>  confirmed_flush_lsn
>  latest_end_lsn
>  local_lsn
>  receive_start_lsn
>  received_lsn
>  remote_lsn
>  restart_lsn
>  srsublsn
> (8 rows)
> 
> 
> Feature is already frozen, but this seems inconsistent a bit..

I think these are all recently added for logical replication.  We could
rename them to _location.

I'm not a fan of renaming everything the opposite way.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] minor typo in client-auth.sgml

2017-04-14 Thread Jaime Casanova
Hi,

I was reading about SCRAM and the changes in the docs and found that
in this sentence (in doc/src/sgml/client-auth.sgml:925) there is a
missing comma (,) when listing the "password-based authentication
methods":

"""
The password-based authentication methods are scram
md5 and password.
"""

-- 
Jaime Casanova  www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-14 Thread Tom Lane
Andres Freund  writes:
> On 2017-04-13 14:05:43 -0400, Tom Lane wrote:
>> Hm.  That ties into something I was looking at yesterday.  The only
>> reason that function is called so much is that bootstrap mode runs a
>> separate transaction for *each line of the bki file* (cf do_start,
>> do_end in bootparse.y).  Which seems pretty silly.

> Indeed.

I failed to resist the temptation to poke at this, and found that
indeed nothing seems to break if we just use one transaction for the
whole processing of postgres.bki.  So I've pushed a patch that does
that.  We're definitely down to the point where worrying about the
speed of bootstrap mode, per se, is useless; the other steps in
initdb visibly take a lot more time.

regards, tom lane




Re: [HACKERS] tablesync patch broke the assumption that logical rep depends on?

2017-04-14 Thread Petr Jelinek
On 13/04/17 19:31, Fujii Masao wrote:
> On Fri, Apr 14, 2017 at 1:28 AM, Peter Eisentraut
>  wrote:
>> On 4/10/17 13:28, Fujii Masao wrote:
>>>  src/backend/replication/logical/launcher.c
>>>  * Worker started and attached to our shmem. This check is safe
>>>  * because only launcher ever starts the workers, so nobody can 
>>> steal
>>>  * the worker slot.
>>>
>>> The tablesync patch enabled even worker to start another worker.
>>> So the above assumption is not valid for now.
>>>
>>> This issue seems to cause the corner case where the launcher picks up
>>> the same worker slot that previously-started worker has already picked
>>> up to start another worker.
>>
>> I think what the comment should rather say is that workers are always
>> started through logicalrep_worker_launch() and worker slots are always
>> handed out while holding LogicalRepWorkerLock exclusively, so nobody can
>> steal the worker slot.
>>
>> Does that make sense?
> 
> No unless I'm missing something.
> 
> logicalrep_worker_launch() picks up unused worker slot (slot's proc == NULL)
> while holding LogicalRepWorkerLock. But it releases the lock before the slot
> is marked as used (i.e., slot is set to non-NULL). Then newly-launched worker
> calls logicalrep_worker_attach() and marks the slot as used.
> 
> So if another logicalrep_worker_launch() starts after LogicalRepWorkerLock
> is released before the slot is marked as used, it can pick up the same slot
> because that slot looks unused.
> 

Yeah, I think it's less a problem with that comment than with the fact that
logicalrep_worker_launch isn't concurrency-safe. We need an in_use marker
for the workers and must update it as needed instead of relying on pgproc.
I'll write up something over the weekend.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [pgsql-www] [HACKERS] Small issue in online devel documentation build

2017-04-14 Thread Magnus Hagander
On Fri, Apr 14, 2017 at 9:36 PM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 4/14/17 14:45, Magnus Hagander wrote:
> > Attached is a patch that can be applied to pgweb which should fix
> all of
> > this.
> >
> >
> >
> > Applied, thanks.
>
> Oops, one  too many.
>
>
Indeed. Fix pushed.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [pgsql-www] [HACKERS] Small issue in online devel documentation build

2017-04-14 Thread Peter Eisentraut
On 4/14/17 14:45, Magnus Hagander wrote:
> Attached is a patch that can be applied to pgweb which should fix all of
> this.
> 
> 
> 
> Applied, thanks. 

Oops, one  too many.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] SCRAM authentication, take three

2017-04-14 Thread Peter Eisentraut
On 4/11/17 01:10, Heikki Linnakangas wrote:
> That question won't arise in practice. Firstly, if the server can do 
> scram-sha-256-plus, it presumably can also do scram-sha-512-plus. Unless 
> there's a change in the way the channel binding works, such that the 
> scram-sha-512-plus variant needs a newer version of OpenSSL or 
> something. Secondly, the user's pg_authid row will contain a 
> SCRAM-SHA-256 or SCRAM-SHA-512 verifier, not both, so that will dictate 
> which one to use.

Right.  So putting the actual password method in pg_hba.conf isn't going
to be useful very often.

I think the most practical thing that the user wants in pg_hba.conf is
"best password method supported by what is in pg_authid".  This is
currently spelled "md5", which is of course pretty weird.  And it will
become weirder over time.

I think we want to have a new keyword in pg_hba.conf for that, one which
does not indicate any particular algorithm or method (so not "scram" or
"sasl").

We could use "password".  If we think that "md5" can mean md5-or-beyond,
then maybe "password" can mean password-or-md5-or-beyond.

Or otherwise a completely new word.

We also want to give users/admins a way to phase out old methods or set
some policy.  We could either make a global GUC setting
password_methods='md5 scram-sha-256' and/or make that an option in
pg_hba.conf past the method field.
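
To make that concrete, a sketch of what this could look like (the first
entry is today's real syntax; the method-neutral keyword and the GUC are
only the hypothetical spellings discussed above):

```
# pg_hba.conf -- today, "md5" effectively means "md5 or better,
# depending on what pg_authid stores for the user":
host    all    all    10.0.0.0/8    md5

# hypothetical method-neutral keyword meaning "best available method":
host    all    all    10.0.0.0/8    password

# hypothetical policy setting in postgresql.conf:
#password_methods = 'md5 scram-sha-256'
```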

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Wincrypt.h vs wincrypt.h

2017-04-14 Thread Magnus Hagander
On Fri, Apr 14, 2017 at 9:27 PM, Andrew Dunstan <
andrew.duns...@2ndquadrant.com> wrote:

>
> Commit fe0a0b59 included this line:
>
> #include <Wincrypt.h>
>
> On Windows file names are not case sensitive, so the "W" works fine, but
> as I discovered this morning, if you're cross-compiling on Linux it
> matters very much, and mingw-w64 ships headers with lower case file
> names, in this case "wincrypt.h". Therefore, unless there's an objection
> I propose to lower case that "W".
>

+1. Regardless of the cross compiling, I believe using lowercase is more or
less the convention on Windows for this.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


[HACKERS] Wincrypt.h vs wincrypt.h

2017-04-14 Thread Andrew Dunstan

Commit fe0a0b59 included this line:

#include <Wincrypt.h>

On Windows file names are not case sensitive, so the "W" works fine, but
as I discovered this morning, if you're cross-compiling on Linux it
matters very much, and mingw-w64 ships headers with lower case file
names, in this case "wincrypt.h". Therefore, unless there's an objection
I propose to lower case that "W".

cheers

andrew

-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





[HACKERS] PANIC in pg_commit_ts slru after crashes

2017-04-14 Thread Jeff Janes
In the first statement executed after crash recovery, I sometimes get this
error:

PANIC:  XX000: could not access status of transaction 207580505
DETAIL:  Could not read from file "pg_commit_ts/1EF0" at offset 131072:
Success.
LOCATION:  SlruReportIOError, slru.c:918
STATEMENT:  create temporary table aldjf (x serial)

Other examples:

PANIC:  XX000: could not access status of transaction 3483853232
DETAIL:  Could not read from file "pg_commit_ts/20742" at offset 237568:
Success.
LOCATION:  SlruReportIOError, slru.c:918
STATEMENT:  create temporary table aldjf (x serial)

PANIC:  XX000: could not access status of transaction 802552883
DETAIL:  Could not read from file "pg_commit_ts/779E" at offset 114688:
Success.
LOCATION:  SlruReportIOError, slru.c:918
STATEMENT:  create temporary table aldjf (x serial)

Based on the errno, I'm assuming the read was successful but returned the
wrong number of bytes (which was zero in the case I saw after changing the
code to log short reads).

It then goes through recovery again and the problem does not immediately
re-occur if you attempt to connect again.  I don't know why the file size
would have changed between attempts.


The problem bisects to the commit:

commit 728bd991c3c4389fb39c45dcb0fe57e4a1dccd71
Author: Simon Riggs 
Date:   Tue Apr 4 15:56:56 2017 -0400

Speedup 2PC recovery by skipping two phase state files in normal path


It is not obvious to me how that is relevant.  My test doesn't use prepared
transactions (and leaves the guc at zero), and this commit doesn't touch
the slru.c.

I'm attaching the test harness.  There is a patch which injects the
crash-faults and also allows xid fast-forward, a perl script that runs
until crash and assesses the consistency of the post-crash results, and a
shell script which sets up the database and then calls the perl script in a
loop.  On an 8-CPU machine, it takes about an hour for the PANIC to occur.

The attached script bails out once it sees the PANIC (count.pl line 158);
if it didn't do that, it would proceed to connect again and retry, and
everything works fine.  No apparent permanent data corruption.

Any clues on how to investigate this further?

Cheers,

Jeff


count.pl
Description: Binary data


crash_REL10.patch
Description: Binary data


do.sh
Description: Bourne shell script


slru_log_read_size.patch
Description: Binary data



Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-14 Thread Peter Eisentraut
On 4/11/17 09:03, Magnus Hagander wrote:
> I would expect most enterprise customers who care about MITM protection
> are already using either TLS or IPsec to cover that. They still
> benefit from the other parts of SCRAM, but they've already solved those
> problems.

Yeah, I think if you're concerned about MITM then you would also be
concerned about MITM siphoning off your data.  So you should be using
TLS and then you don't need channel binding.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-14 Thread Petr Jelinek
On 14/04/17 21:05, Peter Eisentraut wrote:
> On 4/14/17 14:23, Petr Jelinek wrote:
>> Ah yes looking at the code, it does exactly that (on master only). Means
>> that backport will be necessary.
> 
> I think these two sentences are contradicting each other.
> 

Hehe, I didn't realize "master" would be taken to mean the master branch; I
meant master as in "not a standby" :)

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-14 Thread Peter Eisentraut
On 4/14/17 14:23, Petr Jelinek wrote:
> Ah yes looking at the code, it does exactly that (on master only). Means
> that backport will be necessary.

I think these two sentences are contradicting each other.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Interval for launching the table sync worker

2017-04-14 Thread Peter Eisentraut
On 4/13/17 18:09, Petr Jelinek wrote:
> It would have the
> disadvantage that if some tables were consistently failing, no other
> tables could get synchronized, as the number of workers is limited. OTOH
> the approach chosen in the patch will first try all tables and only then
> start rate limiting; I'm not quite sure which is better.

I think the latter is better.

One of the scenarios cited originally somewhere was one table failing
because of an encoding issue.  So it's useful to allow other tables to
continue.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Interval for launching the table sync worker

2017-04-14 Thread Peter Eisentraut
On 4/13/17 06:23, Masahiko Sawada wrote:
> Attached is the latest patch. It wasn't actually necessary to change
> GetSubscriptionNotReadyRelations. I just changed the logic for refreshing
> the sync table state list.
> Please give me feedback.

I think this is a reasonable direction, but do we really need
table_sync_retry_interval separate from wal_retrieve_retry_interval?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Logical replication and inheritance

2017-04-14 Thread Peter Eisentraut
On 4/13/17 01:00, Noah Misch wrote:
>> Relative to some of the other open items I'm involved in, I consider
>> this a low priority, so I'm not working on it right now.
> 
> By what day should the community look for your next update?

Tuesday

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Logical replication and inheritance

2017-04-14 Thread Peter Eisentraut
On 4/13/17 06:48, Amit Langote wrote:
> That is an important consideration because of pg_dump.  See below:
> 
> create table foo (a int);
> create table bar () inherits (foo);
> create publication mypub for table foo;  -- bar is added too.
> 
> $ pg_dump -s | psql -e test
> 
> ALTER PUBLICATION mypub ADD TABLE foo;
> ERROR:  relation "bar" is already member of publication "mypub"

To fix this, pg_dump should emit ADD TABLE ONLY foo.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-14 Thread Andreas Karlsson

On 04/13/2017 06:13 PM, Tom Lane wrote:

I've pushed this with some mostly-cosmetic adjustments:


Thanks! I like your adjustments.


There's certainly lots more that could be done in the genbki code,
but I think all we can justify at this stage of the development
cycle is to get the low-hanging fruit for testing speedups.


Yeah, I also noticed that the genbki code seems to have gotten little 
love and that much more can be done here.


Andreas




Re: [pgsql-www] [HACKERS] Small issue in online devel documentation build

2017-04-14 Thread Magnus Hagander
On Sat, Apr 8, 2017 at 3:52 AM, Bruce Momjian  wrote:

> On Fri, Mar 24, 2017 at 07:01:46AM +0100, Fabien COELHO wrote:
> >
> > Hello Peter,
> >
> > >I think the fix belongs into the web site CSS, so there is nothing to
> > >commit into PostgreSQL here.
> >
> > Indeed, the changes were only for the "remove nesting" solution.
> >
> > >I will close the commit fest entry, but I have added a section to the open
> > >items list so we keep track of it. (https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items#Documentation_tool_chain)
> >
> > I put forward that the quick workaround a colleague of mine suggested
> > (aka something like code code { font-size: 100% !important; }) could
> > also be applied to the web site CSS while waiting for a more definite
> > answer, which might take some pretty unknown time close to never?
>
> Sorry I am just getting back to this.  Below I am going to cover only
> the problem with the font size of nested  tags, and I am going to
> confirm what most people already figured out.
>
> The basic problem was already posted by Fabien, with an image example.
> The cause of the fonts being too large on Chrome is an interaction of
> Chrome's default font size for different blocks, the JavaScript that is
> meant to fix such mismatches, and the new nested code blocks in the PG
> 10 docs.
>
> First, the JavaScript:
>
> https://github.com/postgres/pgweb/blob/master/media/js/monospacefix.js
>
> There is no git history for this file except for its initial checkin in
> 2011, but I am pretty sure I wrote it.  What it does is to create a
> monospace text block and a proportional text block, find the font point
> size of each, and compute a ratio.  If the ratio is not 1, the monospace
> blocks are adjusted in size to
> match the proportional text.  The complex part is that the JavaScript conditionally
> injects CSS into the web-page to accomplish this.
>
> The reason the PG 10 docs look fine on Linux Firefox is because the font
> point sizes match, so no CSS is injected.  They don't match on Chrome,
> so the CSS is injected.  When the CSS hits doubly-nested code blocks
> (code inside code), it makes the font too large because it double-adjusts.
>
> The fix, as Fabien identified, is to conditionally inject additional CSS
> to be _more_ specific than the first CSS and set the font-size to a
> simple '1em' so the first CSS is not applied twice.  I don't think
> '!important' is necessary but it would be good to test this.
>
> Attached is a patch that can be applied to pgweb which should fix all of
> this.
>
>

Applied, thanks.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] logical replication worker and statistics

2017-04-14 Thread Peter Eisentraut
On 4/14/17 12:09, Petr Jelinek wrote:
> Indeed, if the catchup phase didn't happen (because tablesync was faster
> than apply) then the commit handler is never called, so all the changes
> made by the copy would be forgotten. Also the tablesync worker might exit
> from the process_syncing_tables() call, which is called before we report
> stats in the commit handler, so again some changes might be forgotten.
> 
> I attached a modified version of the patch that also reports stats in
> finish_sync_worker() when there is an outstanding transaction. The test can
> stay the same.

committed (without the tests)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-14 Thread Petr Jelinek
On 14/04/17 19:33, Fujii Masao wrote:
> On Fri, Apr 14, 2017 at 10:33 PM, Petr Jelinek
>  wrote:
>> On 12/04/17 15:55, Fujii Masao wrote:
>>> Hi,
>>>
>>> When I shut down the publisher while I repeated creating and dropping
>>> the subscription in the subscriber, the publisher emitted the following
>>> PANIC error during shutdown checkpoint.
>>>
>>> PANIC:  concurrent transaction log activity while database system is
>>> shutting down
>>>
>>> The cause of this problem is that walsender for logical replication can
>>> generate WAL records even during shutdown checkpoint.
>>>
>>> Firstly walsender keeps running until shutdown checkpoint finishes
>>> so that all the WAL including shutdown checkpoint record can be
>>> replicated to the standby. This was safe because previously walsender
>>> could not generate WAL records. However this assumption became
>>> invalid because of logical replication. That is, currently walsender for
>>> logical replication can generate WAL records, for example, by executing
>>> CREATE_REPLICATION_SLOT command. This is an oversight in
>>> logical replication patch, I think.
>>
>> Hmm, but CREATE_REPLICATION_SLOT should not generate WAL afaik. I agree
>> that the issue with walsender still exist (since we now allow normal SQL
>> to run there) but I think it's important to identify what exactly causes
>> the WAL activity in your case
> 
> At least in my case, the following CREATE_REPLICATION_SLOT command
> generated WAL record.
> 
> BEGIN READ ONLY ISOLATION LEVEL REPEATABLE READ;
> CREATE_REPLICATION_SLOT testslot TEMPORARY LOGICAL pgoutput USE_SNAPSHOT;
> 
> Here is the pg_waldump output of the WAL record that CREATE_REPLICATION_SLOT
> generated.
> 
> rmgr: Standby len (rec/tot): 24/50, tx:  0,
> lsn: 0/01601438, prev 0/01601400, desc: RUNNING_XACTS nextXid 692
> latestCompletedXid 691 oldestRunningXid 692
> 
> So I guess that the CREATE_REPLICATION_SLOT code calls LogStandbySnapshot(),
> which generates a WAL record about the snapshot of running transactions.
> 

Ah yes looking at the code, it does exactly that (on master only). Means
that backport will be necessary.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Tuplesort merge pre-reading

2017-04-14 Thread Peter Geoghegan
On Fri, Apr 14, 2017 at 5:57 AM, Robert Haas  wrote:
> I don't think there's any one fixed answer, because increasing the
> number of tapes reduces I/O by adding CPU cost, and vice versa.

Sort of, but if you have to merge hundreds of runs (a situation that
should be quite rare), then you should be concerned about being CPU
bound first, as Knuth was. Besides, on modern hardware, read-ahead can
be more effective if you have more merge passes, to a point, which
might also make it worth it -- using hundreds of tapes results in
plenty of *random* I/O. Plus, most of the time you only do a second
pass over a subset of initial quicksorted runs -- not all of them.

Probably the main complicating factor that Knuth doesn't care about is
time to return the first tuple -- startup cost. That was a big
advantage for commit df700e6 that I should have mentioned.

I'm not seriously suggesting that we should prefer multiple passes in
the vast majority of real world cases, nor am I suggesting that we
should go out of our way to help cases that need to do that. I just
find all this interesting.

-- 
Peter Geoghegan





Re: [HACKERS] logical replication apply to run with sync commit off by default

2017-04-14 Thread Peter Eisentraut
On 4/13/17 17:52, Petr Jelinek wrote:
> On 12/04/17 06:10, Peter Eisentraut wrote:
>> On 3/24/17 10:49, Petr Jelinek wrote:
>>> Rebase after table copy patch got committed.
>>
>> You could please sent another rebase of this, perhaps with some more
>> documentation, as discussed downthread.
>>
>> Also, I wonder why we don't offer the other values of
>> synchronous_commit.  In any case, we should keep the values consistent.
>> So on should be on and local should be local, but not mixing it.
>>
> 
> Now that I am back from vacation I took a look at this again. The reason
> why I did only boolean initially is that the other values just didn't
> seem all that useful. But you are right that it's better to be consistent,
> and there is no real reason not to allow all of the possible values.
> 
> So how about the attached?

committed

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-14 Thread Fujii Masao
On Fri, Apr 14, 2017 at 10:33 PM, Petr Jelinek
 wrote:
> On 12/04/17 15:55, Fujii Masao wrote:
>> Hi,
>>
>> When I shut down the publisher while I repeated creating and dropping
>> the subscription in the subscriber, the publisher emitted the following
>> PANIC error during shutdown checkpoint.
>>
>> PANIC:  concurrent transaction log activity while database system is
>> shutting down
>>
>> The cause of this problem is that walsender for logical replication can
>> generate WAL records even during shutdown checkpoint.
>>
>> Firstly walsender keeps running until shutdown checkpoint finishes
>> so that all the WAL including shutdown checkpoint record can be
>> replicated to the standby. This was safe because previously walsender
>> could not generate WAL records. However this assumption became
>> invalid because of logical replication. That is, currently walsender for
>> logical replication can generate WAL records, for example, by executing
>> CREATE_REPLICATION_SLOT command. This is an oversight in
>> logical replication patch, I think.
>
> Hmm, but CREATE_REPLICATION_SLOT should not generate WAL afaik. I agree
> that the issue with walsender still exists (since we now allow normal SQL
> to run there), but I think it's important to identify what exactly causes
> the WAL activity in your case.

At least in my case, the following CREATE_REPLICATION_SLOT command
generated WAL record.

BEGIN READ ONLY ISOLATION LEVEL REPEATABLE READ;
CREATE_REPLICATION_SLOT testslot TEMPORARY LOGICAL pgoutput USE_SNAPSHOT;

Here is the pg_waldump output of the WAL record that CREATE_REPLICATION_SLOT
generated.

rmgr: Standby len (rec/tot): 24/50, tx:  0,
lsn: 0/01601438, prev 0/01601400, desc: RUNNING_XACTS nextXid 692
latestCompletedXid 691 oldestRunningXid 692

So I guess that the CREATE_REPLICATION_SLOT code calls LogStandbySnapshot(),
which generates a WAL record about the snapshot of running transactions.

Regards,

-- 
Fujii Masao




Re: [HACKERS] Make defines in pg_config_manual.h conditional?

2017-04-14 Thread Tom Lane
Andres Freund  writes:
> ATM the defines in pg_config_manual.h are largely unconditional - which
> means the file has to be edited instead of being able to override things
> via CFLAGS/COPT.  Is there any good reason for doing so?

YES.  If you do things like that, the installed version of
pg_config_manual.h will not reflect the way the backend was built,
which is likely to break pgxs-type builds of addon modules later.

Just edit the file.

regards, tom lane




[HACKERS] Make defines in pg_config_manual.h conditional?

2017-04-14 Thread Andres Freund
Hi,

ATM the defines in pg_config_manual.h are largely unconditional - which
means the file has to be edited instead of being able to override things
via CFLAGS/COPT.  Is there any good reason for doing so?

Greetings,

Andres Freund




Re: [HACKERS] Variable substitution in psql backtick expansion

2017-04-14 Thread Daniel Verite
Fabien COELHO wrote:

> Pavel also suggested some support for TEXT, although I would like to see a 
> use case. That could be another extension to the engine.

SQLSTATE is text.

Also when issuing "psql -v someoption=value -f script", it's reasonable
to want to compare :someoption to 'value' in the script.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: [HACKERS] logical replication worker and statistics

2017-04-14 Thread Petr Jelinek
On 12/04/17 05:41, Peter Eisentraut wrote:
> On 4/10/17 13:06, Stas Kelvich wrote:
>>
>>> On 10 Apr 2017, at 19:50, Peter Eisentraut 
>>>  wrote:
>>>
>>> On 4/10/17 05:49, Stas Kelvich wrote:
>>>> Here is a small patch to call statistics in the logical worker. Originally
>>>> I thought that stat collection during logical replication should manually
>>>> account for the amounts of changed tuples, but it seems that it is already
>>>> smoothly handled at the relation level. So a call to
>>>> pgstat_report_stat() is enough.
>>>
>>> I wonder whether we need a similar call somewhere in tablesync.c.  It
>>> seems to work without it, though.
>>
>> I thought it spins up the same worker from worker.c.
> 
> It depends on which of the various tablesync scenarios happens.  If the
> apply lags behind the tablesync, then the apply doesn't process commit
> messages until it has caught up.  So the part of the code you patched
> wouldn't be called.
> 

Indeed, if the catchup phase didn't happen (because tablesync was faster
than apply) then the commit handler is never called, so all the changes
made by the copy would be forgotten. Also the tablesync worker might exit
from the process_syncing_tables() call, which is called before we report
stats in the commit handler, so again some changes might be forgotten.

I attached a modified version of the patch that also reports stats in
finish_sync_worker() when there is an outstanding transaction. The test can
stay the same.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
From 231973cafe12bb948924a6380ac785d93b6af086 Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Fri, 14 Apr 2017 17:54:03 +0200
Subject: [PATCH] Report statistics in logical replication workers

---
 src/backend/postmaster/pgstat.c | 9 +
 src/backend/replication/logical/tablesync.c | 8 +++-
 src/backend/replication/logical/worker.c| 1 +
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 3fb57f0..36fedd8 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -769,10 +769,11 @@ allow_immediate_pgstat_restart(void)
 /* --
  * pgstat_report_stat() -
  *
- *	Called from tcop/postgres.c to send the so far collected per-table
- *	and function usage statistics to the collector.  Note that this is
- *	called only when not within a transaction, so it is fair to use
- *	transaction stop time as an approximation of current time.
+ *	Must be called by processes that perform DML: tcop/postgres.c, logical
+ *	receiver processes, SPI workers, etc., to send the so far collected per-table
+ *	and function usage statistics to the collector. Note that this is called
+ *	only when not within a transaction, so it is fair to use transaction stop
+ *	time as an approximation of current time.
  * --
  */
 void
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index d1f2734..aa1a85b 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -114,9 +114,15 @@ StringInfo	copybuf = NULL;
 static void pg_attribute_noreturn()
 finish_sync_worker(void)
 {
-	/* Commit any outstanding transaction. */
+	/*
+	 * Commit any outstanding transaction. This is the usual case, unless
+	 * there was nothing to do for the table.
+	 */
 	if (IsTransactionState())
+	{
 		CommitTransactionCommand();
+		pgstat_report_stat(false);
+	}
 
 	/* And flush all writes. */
 	XLogFlush(GetXLogWriteRecPtr());
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 3313448..169471c 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -462,6 +462,7 @@ apply_handle_commit(StringInfo s)
 	/* Process any tables that are being synchronized in parallel. */
 	process_syncing_tables(commit_data.end_lsn);
 
+	pgstat_report_stat(false);
 	pgstat_report_activity(STATE_IDLE, NULL);
 }
 
-- 
2.7.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-04-14 Thread Jaime Casanova
On 5 April 2017 at 13:32, Pavan Deolasee  wrote:
>
> Ok. I've extensively updated the README to match the current state of
> affairs. Updated patch set attached.

Hi Pavan,

I ran a test on the current WARM patchset: I used pgbench with a scale
of 20 and a fillfactor of 90, started the pgbench run with 6 clients in
parallel, and also ran sqlsmith on it.

And I got a core dump after some time with those things running.

The assertion that fails is:

"""
LOG:  statement: UPDATE pgbench_tellers SET tbalance = tbalance + 3519
WHERE tid = 34;
TRAP: FailedAssertion("!(((bool) (((const void *) (&(tup)->t_ctid) !=
((void *) 0)) && (((&(tup)->t_ctid)->ip_posid & ((((uint16) 1) << 13) -
1)) != 0))))", File: "../../../../src/include/access/htup_details.h",
Line: 659)
"""

-- 
Jaime Casanova  www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
#0  0x7f42d832e067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f42d832f448 in __GI_abort () at abort.c:89
#2  0x00842961 in ExceptionalCondition (conditionName=, 
errorType=, fileName=, lineNumber=)
at assert.c:54
#3  0x004c48e6 in HeapTupleHeaderGetRootOffset (tup=) at 
../../../../src/include/access/htup_details.h:659
#4  heap_update (relation=0x7f4294dc8168, otid=0x2660ff0, newtup=0x6, cid=2, 
crosscheck=0x7f42d91bc700, wait=-102 '\232', hufd=0x7ffd14fcebb0, 
lockmode=0x7ffd14fceb9c, modified_attrsp=0x7ffd14fceba8, 
warm_update=0x7ffd14fceb9b "") at heapam.c:4672
#5  0x0062e95d in ExecUpdate (tupleid=0x7ffd14fcec90, oldtuple=0x2324, 
slot=0x26606f8, planSlot=0x, epqstate=0x2660ff0, 
estate=0x25db9e0, 
canSetTag=1 '\001') at nodeModifyTable.c:1012
#6  0x0062f009 in ExecModifyTable (node=0x25dbc48) at 
nodeModifyTable.c:1609
#7  0x00616198 in ExecProcNode (node=node@entry=0x25dbc48) at 
execProcnode.c:424
#8  0x006116a6 in ExecutePlan (execute_once=, 
dest=0x262e498, direction=, numberTuples=0, 
sendTuples=, 
operation=CMD_UPDATE, use_parallel_mode=, 
planstate=0x25dbc48, estate=0x25db9e0) at execMain.c:1651
#9  standard_ExecutorRun (queryDesc=0x2666100, direction=, 
count=0, execute_once=) at execMain.c:360
#10 0x00746822 in ProcessQuery (plan=, 
sourceText=0x25ac4e0 "UPDATE pgbench_tellers SET tbalance = tbalance + 3494 
WHERE tid = 76;", 
params=0x0, queryEnv=0x0, dest=0x262e498, completionTag=0x7ffd14fcf010 "") 
at pquery.c:162
#11 0x00746a93 in PortalRunMulti (portal=portal@entry=0x25462d0, 
isTopLevel=isTopLevel@entry=1 '\001', setHoldSnapshot=setHoldSnapshot@entry=0 
'\000', 
dest=dest@entry=0x262e498, altdest=altdest@entry=0x262e498, 
completionTag=completionTag@entry=0x7ffd14fcf010 "") at pquery.c:1287
#12 0x007476a2 in PortalRun (portal=0x25462d0, 
count=9223372036854775807, isTopLevel=, run_once=, dest=0x262e498, 
altdest=0x262e498, completionTag=0x7ffd14fcf010 "") at pquery.c:800
#13 0x0074361b in exec_simple_query (query_string=0x2324 ) at postgres.c:1105
#14 0x00745254 in PostgresMain (argc=1, argv=0x25ac4e0, 
dbname=0x2555888 "pgbench", username=0x2528260 "jcasanov") at postgres.c:4075
#15 0x004780f3 in BackendRun (port=0x254dd70) at postmaster.c:4317
#16 BackendStartup (port=0x254dd70) at postmaster.c:3989
#17 ServerLoop () at postmaster.c:1729
#18 0x006d33d0 in PostmasterMain (argc=argc@entry=3, 
argv=argv@entry=0x25260c0) at postmaster.c:1337
#19 0x00478f4d in main (argc=3, argv=0x25260c0) at main.c:228




Re: [HACKERS] Different table schema in logical replication crashes

2017-04-14 Thread Petr Jelinek


On 14/04/17 17:33, Peter Eisentraut wrote:
> On 4/14/17 08:49, Petr Jelinek wrote:
>>> Are we prepared to support different schemas in v10? Or should we
>>> disallow it for v10 and add a TODO?
>>>
>>
>> Ah nuts, yes it's supposed to be supported, we seem to not initialize
>> cstate->range_table in tablesync which causes this bug. The CopyState
>> struct is private to copy.c so we can't easily set cstate->range_table
>> externally. I wonder if tablesync should just construct CopyStmt instead
>> of calling the lower level API.
> 
> Maybe pass the range_table to BeginCopyFrom so that it can write it into
> cstate?
> 

That would work. The reason I am thinking of creating a CopyStmt instead
is that, to create the range_table, we'd basically have to duplicate the
code from DoCopy verbatim. Obviously, making a CopyStmt isn't without
trouble either, as it would have to newly support the callback input.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [pgsql-www] Small issue in online devel documentation build

2017-04-14 Thread Peter Eisentraut
On 4/14/17 11:35, Bruce Momjian wrote:
> I think the only open issue is the JavaScript CSS font fix that has a
> posted patch here:
> 
>   https://www.postgresql.org/message-id/20170408015201.ga18...@momjian.us

The open item is specifically about this issue.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-04-14 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Robert Haas  writes:
> > On Fri, Apr 14, 2017 at 4:28 AM, Kyotaro HORIGUCHI
> >  wrote:
> >> Similarly, these columns may need renaming.
> 
> > Personally, I would be inclined not to tinker with this, not just
> > because we're after freeze but because it doesn't seem like an
> > improvement to me.  Referring to an LSN as a location seems fine to me;
> > I mean, granted, it's one specific kind of location, but that doesn't
> > make it wrong.
> 
> In a green field it would be perfectly fine --- but I think Kyotaro-san's
> point is about consistency.  If all the other exposed names that involve
> the same concept use "lsn", then it's fair to say that it's a bad idea
> for these four column names to be randomly different from the rest.
> 
> Now this is a pre-existing problem: those column names existed in 9.6,
> and so did some of the ones named using "lsn".  But we've added more
> of the latter in v10.  I think the real problem right now is that nobody
> has a rule to follow about which way to name new exposed references to
> the concept, and that's simply bad.
> 
> It's possible that we should say that backwards compatibility outweighs
> consistency and therefore it's too late to change these names.  But
> I think your argument above is missing the point.

I agree and definitely view 'lsn' as better than just 'location' when
we're talking about an lsn.  The datatype is 'pg_lsn', let's use 'lsn'
whenever that's what it is.  Consistency here is really good.

Thanks!

Stephen




Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-04-14 Thread Bruce Momjian
On Fri, Apr 14, 2017 at 10:22:36AM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > On Fri, Apr 14, 2017 at 4:28 AM, Kyotaro HORIGUCHI
> >  wrote:
> >> Similarly, these columns may need renaming.
> 
> > Personally, I would be inclined not to tinker with this, not just
> > because we're after freeze but because it doesn't seem like an
> > improvement to me.  Referring to an LSN as a location seems fine to me;
> > I mean, granted, it's one specific kind of location, but that doesn't
> > make it wrong.
> 
> In a green field it would be perfectly fine --- but I think Kyotaro-san's
> point is about consistency.  If all the other exposed names that involve
> the same concept use "lsn", then it's fair to say that it's a bad idea
> for these four column names to be randomly different from the rest.
> 
> Now this is a pre-existing problem: those column names existed in 9.6,
> and so did some of the ones named using "lsn".  But we've added more
> of the latter in v10.  I think the real problem right now is that nobody
> has a rule to follow about which way to name new exposed references to
> the concept, and that's simply bad.
> 
> It's possible that we should say that backwards compatibility outweighs
> consistency and therefore it's too late to change these names.  But
> I think your argument above is missing the point.

Yeah, this area is complex enough so any consistency we can add helps.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] Interval for launching the table sync worker

2017-04-14 Thread Masahiko Sawada
On Fri, Apr 14, 2017 at 9:14 PM, Petr Jelinek
 wrote:
> On 14/04/17 12:18, Masahiko Sawada wrote:
>> On Fri, Apr 14, 2017 at 7:09 AM, Petr Jelinek
>>  wrote:
>>> On 13/04/17 12:23, Masahiko Sawada wrote:
 On Thu, Apr 13, 2017 at 11:53 AM, Masahiko Sawada  
 wrote:
> On Wed, Apr 12, 2017 at 11:46 PM, Peter Eisentraut
>  wrote:
>> On 4/12/17 00:48, Masahiko Sawada wrote:
>>> On Wed, Apr 12, 2017 at 1:28 PM, Peter Eisentraut
 Perhaps instead of a global last_start_time, we store a per relation
 last_start_time in SubscriptionRelState?
>>>
>>> I was thinking the same. But a problem is that the list of
>>> SubscriptionRelState is refreshed whenever the syncing table state
>>> becomes invalid (table_state_valid = false). I guess we need to
>>> improve these logic including GetSubscriptionNotReadyRelations().
>>
>> The table states are invalidated on a syscache callback from
>> pg_subscription_rel, which happens roughly speaking when a table
>> finishes the initial sync.  So if we're worried about failing tablesync
>> workers relaunching to quickly, this would only be a problem if a
>> tablesync of another table finishes right in that restart window.  That
>> doesn't seem a terrible issue to me.
>>
>
> I think the table states are invalidated whenever the table sync
> worker starts, because the table sync worker updates its status of
> pg_subscription_rel and commits it before starting actual copy. So we
> cannot rely on that. I thought we can store last_start_time into
> pg_subscription_rel but it might be overkill. I'm now thinking to
> change GetSubscriptionNotReadyRelations so that last_start_time in
> SubscriptionRelState is taken over to new list.
>

 Attached the latest patch. It didn't actually necessary to change
 GetSubscriptionNotReadyRelations. I just changed the logic refreshing
 the sync table state list.
 Please give me feedback.

>>>
>>> Hmm this might work. Although I was actually wondering if we could store
>>> the last start timestamp in the worker shared memory and do some magic
>>> with that (ie, not clearing subid and relid and try to then do rate
>>> limiting based on subid+relid+timestamp stored in shmem). That would
>>> then work same way for the main apply workers as well. It would have the
>>> disadvantage that if some tables were consistently failing, no other
>>> tables could get synchronized as the amount of workers is limited.
>>
>> Hmm, I guess that it's not a good design for a table sync worker and an
>> apply worker for a relation to take sole possession of a worker slot
>> until it succeeds. They would interfere with each other.
>>
>
> Not sure what you mean by apply worker for relation, I meant the main
> subscription apply worker, the "per relation apply worker" is same as
> tablesync worker.

Oops, I made a mistake. I meant just an apply worker.

>
> Thinking about the possible failure scenarios more, I guess you are
> right. If there is a "global" event for the publisher/subscriber
> connection (i.e., the network goes down or something), it will affect
> the main worker as well, so it won't launch anything. If there is a
> table-specific issue, it will affect only that table.
>
> The corner-cases with doing it per table where we launch tablesync
> needlessly are pretty much just a) connection limit on publisher
> reached, b) slot limit on publisher reached; that's probably okay. We
> might be able to catch these two specifically and do some global
> limiting based on those (something like your original patch except the
> start-time global variable would only be set on specific error and
> stored in shmem), but that does not need to be solved immediately.
>
> --
>   Petr Jelinek  http://www.2ndQuadrant.com/
>   PostgreSQL Development, 24x7 Support, Training & Services


Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
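As a rough illustration of the per-relation rate limiting being discussed, carrying last_start_time in the relation's sync state could look something like this (the struct layout and function below are hypothetical sketches, not the actual patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t TimestampTz;    /* microseconds, as in PostgreSQL */

/* Hypothetical per-relation sync state; last_start_time would be
 * carried over when the not-ready relation list is rebuilt. */
typedef struct SubscriptionRelState
{
    uint32_t    relid;
    TimestampTz last_start_time;
} SubscriptionRelState;

/* Allow launching a tablesync worker for this relation only if at
 * least "interval" microseconds have passed since the last attempt. */
static bool
should_launch_sync_worker(SubscriptionRelState *rstate,
                          TimestampTz now, TimestampTz interval)
{
    if (now - rstate->last_start_time < interval)
        return false;
    rstate->last_start_time = now;
    return true;
}
```

A consistently failing table would then rate-limit only its own relaunches, matching the observation that a table-specific issue should affect only that table.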




Re: [HACKERS] [pgsql-www] Small issue in online devel documentation build

2017-04-14 Thread Bruce Momjian
On Fri, Apr 14, 2017 at 11:25:48AM -0400, Peter Eisentraut wrote:
> On 4/14/17 01:49, Noah Misch wrote:
> > On Wed, Apr 05, 2017 at 01:43:42PM -0400, Peter Eisentraut wrote:
> >> On 4/5/17 02:56, Noah Misch wrote:
> >>> On Thu, Mar 23, 2017 at 11:21:39PM -0400, Peter Eisentraut wrote:
>  I think the fix belongs into the web site CSS, so there is nothing to
>  commit into PostgreSQL here.  I will close the commit fest entry, but I
>  have added a section to the open items list so we keep track of it.
>  (https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items#Documentation_tool_chain)
> >>>
> >>> [Action required within three days.  This is a generic notification.]
> >>
> >> I will work on this next week.  I believe I will be able to provide a
> >> patch for the web site CSS by April 12, but ultimate success will likely
> >> depend on the collaboration of the web team.
> > 
> > This PostgreSQL 10 open item is past due for your status update.  Kindly 
> > send
> > a status update within 24 hours, and include a date for your subsequent 
> > status
> > update.  Refer to the policy on open item ownership:
> > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
> 
> A patch for the web site CSS has been proposed but it is not getting the
> same urgent attention.  I will report back by Tuesday.

Uh, unless I am missing something, I think it has been applied to pgweb,
specifically on April 12:

https://github.com/postgres/pgweb/commits/master

and this April 12 commit fixed the heading color:


https://github.com/postgres/pgweb/commit/c513457da76a4d864f706c21312d3def1fc8f8c8

I think the only open issue is the JavaScript CSS font fix that has a
posted patch here:

https://www.postgresql.org/message-id/20170408015201.ga18...@momjian.us

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] Different table schema in logical replication crashes

2017-04-14 Thread Peter Eisentraut
On 4/14/17 08:49, Petr Jelinek wrote:
>> Are we prepared to support different schemas in v10? Or should we
>> disallow it for v10 and add a TODO?
>>
> 
> Ah nuts, yes it's supposed to be supported, we seem to not initialize
> cstate->range_table in tablesync which causes this bug. The CopyState
> struct is private to copy.c so we can't easily set cstate->range_table
> externally. I wonder if tablesync should just construct CopyStmt instead
> of calling the lower level API.

Maybe pass the range_table to BeginCopyFrom so that it can write it into
cstate?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [pgsql-www] Small issue in online devel documentation build

2017-04-14 Thread Peter Eisentraut
On 4/14/17 01:49, Noah Misch wrote:
> On Wed, Apr 05, 2017 at 01:43:42PM -0400, Peter Eisentraut wrote:
>> On 4/5/17 02:56, Noah Misch wrote:
>>> On Thu, Mar 23, 2017 at 11:21:39PM -0400, Peter Eisentraut wrote:
 I think the fix belongs into the web site CSS, so there is nothing to
 commit into PostgreSQL here.  I will close the commit fest entry, but I
 have added a section to the open items list so we keep track of it.
 (https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items#Documentation_tool_chain)
>>>
>>> [Action required within three days.  This is a generic notification.]
>>
>> I will work on this next week.  I believe I will be able to provide a
>> patch for the web site CSS by April 12, but ultimate success will likely
>> depend on the collaboration of the web team.
> 
> This PostgreSQL 10 open item is past due for your status update.  Kindly send
> a status update within 24 hours, and include a date for your subsequent status
> update.  Refer to the policy on open item ownership:
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

A patch for the web site CSS has been proposed but it is not getting the
same urgent attention.  I will report back by Tuesday.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [pgsql-www] [HACKERS] Small issue in online devel documentation build

2017-04-14 Thread Peter Eisentraut
On 4/7/17 21:52, Bruce Momjian wrote:
> The fix, as Fabien identified, is to conditionally inject additional CSS
> to be _more_ specific than the first CSS and set the font-size to a
> simple '1em' so the first CSS rule is not applied twice.  I don't think
> '!important' is necessary, but it would be good to test this.
> 
> Attached is a patch that can be applied to pgweb which should fix all of
> this.

IMO, this patch is correct and the best way to fix this.  Please proceed.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Small patch for pg_basebackup argument parsing

2017-04-14 Thread Pierre Ducroquet
On Friday, April 14, 2017 8:59:03 AM CEST Robert Haas wrote:
> On Fri, Apr 14, 2017 at 2:32 AM, Pierre Ducroquet  wrote:
> > On Friday, April 14, 2017 8:44:37 AM CEST Michael Paquier wrote:
> >> On Fri, Apr 14, 2017 at 6:32 AM, Pierre Ducroquet  wrote:
> >> > Yesterday while doing a few pg_basebackup, I realized that the integer
> >> > parameters were not properly checked against invalid input.
> >> > It is not a critical issue, but this could be misleading for an user
> >> > who
> >> > writes -z max or -s 0.5…
> >> > I've attached the patch to this mail. Should I add it to the next
> >> > commit
> >> > fest or is it not needed for such small patches ?
> >> 
> >> A call to atoi is actually equivalent to strtol with the rounding:
> >> (int)strtol(str, (char **)NULL, 10);
> >> So I don't think this is worth caring.
> > 
> > The problem with atoi is that it simply ignores any invalid input and
> > returns 0 instead.
> > That's why I wrote this patch: I made a typo when calling
> > pg_basebackup and was not warned about the invalid input.
> 
> I agree.  I think it would be worth going through and cleaning up
> every instance of this in the source tree.

Well, I won't have much to do during a train trip next week. I'll look into
it and post my results here.
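For reference, a strict replacement for atoi would reject both trailing junk (-z max) and partial parses (-s 0.5). A minimal sketch, with a hypothetical helper name:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse a base-10 integer strictly: fail on empty input, trailing
 * junk, and out-of-range values, instead of silently returning 0
 * the way atoi does. */
static bool
parse_int_strict(const char *str, int *result)
{
    char   *end;
    long    val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0')
        return false;           /* no digits, or trailing garbage */
    if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
        return false;           /* out of int range */
    *result = (int) val;
    return true;
}
```

With a check along these lines, pg_basebackup -z max would produce an error instead of quietly compressing at level 0.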



Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-04-14 Thread Tom Lane
Robert Haas  writes:
> On Fri, Apr 14, 2017 at 4:28 AM, Kyotaro HORIGUCHI
>  wrote:
>> Similarly, these columns may need renaming.

> Personally, I would be inclined not to tinker with this, not just
> because we're after freeze but because it doesn't seem like an
> > improvement to me.  Referring to an LSN as a location seems fine to me;
> I mean, granted, it's one specific kind of location, but that doesn't
> make it wrong.

In a green field it would be perfectly fine --- but I think Kyotaro-san's
point is about consistency.  If all the other exposed names that involve
the same concept use "lsn", then it's fair to say that it's a bad idea
for these four column names to be randomly different from the rest.

Now this is a pre-existing problem: those column names existed in 9.6,
and so did some of the ones named using "lsn".  But we've added more
of the latter in v10.  I think the real problem right now is that nobody
has a rule to follow about which way to name new exposed references to
the concept, and that's simply bad.

It's possible that we should say that backwards compatibility outweighs
consistency and therefore it's too late to change these names.  But
I think your argument above is missing the point.

regards, tom lane




Re: [HACKERS] Why does logical replication launcher set application_name?

2017-04-14 Thread Petr Jelinek
On 12/04/17 02:32, Robert Haas wrote:
> On Tue, Apr 11, 2017 at 8:11 PM, Michael Paquier
>  wrote:
>> On Wed, Apr 12, 2017 at 3:40 AM, Tom Lane  wrote:
>>> I notice looking at pg_stat_activity that the logical replication launcher
>>> sets its application_name to "logical replication launcher".  This seems
>>> inconsistent (no other standard background process sets application_name),
>>> redundant with other columns that already tell you what it is, and an
>>> unreasonable consumption of horizontal space in the tabular output.
>>> Can we drop that?  If we do have to have something like that, what about
>>> putting it in the "query" field where it's much less likely to be
>>> substantially wider than any other entry in the column?
>>
>> It seems to me that the logic behind that is to be able to identify
>> easily the logical replication launcher in pg_stat_activity, so using
> >> the query field instead sounds fine to me, as we need another way than
> >> backend_type to guess what this bgworker is.
> 
> Wait, why do we need two ways?
> 

What do you mean by two ways? There is no way if we don't set anything.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-14 Thread Petr Jelinek
On 12/04/17 15:55, Fujii Masao wrote:
> Hi,
> 
> When I shut down the publisher while I repeated creating and dropping
> the subscription in the subscriber, the publisher emitted the following
> PANIC error during shutdown checkpoint.
> 
> PANIC:  concurrent transaction log activity while database system is
> shutting down
> 
> The cause of this problem is that walsender for logical replication can
> generate WAL records even during shutdown checkpoint.
> 
> Firstly walsender keeps running until shutdown checkpoint finishes
> so that all the WAL including shutdown checkpoint record can be
> replicated to the standby. This was safe because previously walsender
> could not generate WAL records. However this assumption became
> invalid because of logical replication. That is, currently walsender for
> logical replication can generate WAL records, for example, by executing
> CREATE_REPLICATION_SLOT command. This is an oversight in
> logical replication patch, I think.

Hmm, but CREATE_REPLICATION_SLOT should not generate WAL afaik. I agree
that the issue with walsender still exists (since we now allow normal SQL
to run there), but I think it's important to identify what exactly causes
the WAL activity in your case - if it's one of the replication commands
and not SQL, then we'll need to backpatch any solution we come up with to
9.4; if it's not a replication command, we only need to fix 10.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Row Level Security UPDATE Confusion

2017-04-14 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Apr 14, 2017 at 9:09 AM, Stephen Frost  wrote:
> > I agree that the documentation could be improved here.
> 
> This does not seem like it's only a documentation problem.  There
> seems to be no principled reason why a grant for ALL shouldn't have
> exactly the same effect as one grant per relevant operation type.

I agreed already up-thread that there's an issue there and will be
looking to fix it.  That comment was simply replying to Rod's point that
the documentation could also be improved.

Thanks!

Stephen




Re: [HACKERS] Row Level Security UPDATE Confusion

2017-04-14 Thread Robert Haas
On Fri, Apr 14, 2017 at 9:09 AM, Stephen Frost  wrote:
> I agree that the documentation could be improved here.

This does not seem like it's only a documentation problem.  There
seems to be no principled reason why a grant for ALL shouldn't have
exactly the same effect as one grant per relevant operation type.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-04-14 Thread Robert Haas
On Fri, Apr 14, 2017 at 4:28 AM, Kyotaro HORIGUCHI
 wrote:
> Similarly, these columns may need renaming.
>
> s=# select attname, attrelid::regclass from pg_attribute where attname like 
> '%location%';
>  attname |  attrelid
> -+-
>  sent_location   | pg_stat_replication
>  write_location  | pg_stat_replication
>  flush_location  | pg_stat_replication
>  replay_location | pg_stat_replication
> (4 rows)

Personally, I would be inclined not to tinker with this, not just
because we're after freeze but because it doesn't seem like an
improvement to me.  Referring to an LSN as a location seems fine to me;
I mean, granted, it's one specific kind of location, but that doesn't
make it wrong.  If somebody calls you and says "let's meet up", and
you say "sure, what's your location?" they might give you a street
address or GPS coordinates or the name of a nearby point of interest,
depending on what information they have easily available, but that's
cool; those things all let you find them.  Similarly, an LSN lets you
find a particular point within a WAL stream, but I don't think that
makes it not a location.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Row Level Security UPDATE Confusion

2017-04-14 Thread Stephen Frost
Rod,

* Rod Taylor (rod.tay...@gmail.com) wrote:
> My actual use-case involves a range. Most users can see and manipulate the
> record when CURRENT_TIMESTAMP is within active_period. Some users
> (staff/account admins) can see recently dead records too. And a 3rd group
> (senior staff) have no time restriction, though there are a few customers
> they cannot see due to their information being a touch more sensitive.
> I've simplified the below rules to just deal with active_period and the
> majority of user view (@> CURRENT_TIMESTAMP).

Interesting.

> NOTE: the active_period range is '[)' by default, so records with upper() =
> CURRENT_TIMESTAMP are not visible with @> CURRENT_TIMESTAMP restriction.

Is that really what you intend/want though?  For records with
upper() = CURRENT_TIMESTAMP to not be visible?  You are able to change
the range returned from tstzrange by specifying what you want, eg:

select tstzrange(current_timestamp, current_timestamp, '[]');
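The '[)' vs '[]' distinction can be illustrated with a tiny containment check (a toy model of range bounds over integer "timestamps", not PostgreSQL's actual range type implementation):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy containment test: lower bound always inclusive, upper bound
 * inclusive or exclusive depending on the flag. */
static bool
range_contains(long lower, long upper, bool upper_inclusive, long point)
{
    if (point < lower)
        return false;
    return upper_inclusive ? point <= upper : point < upper;
}
```

With '[)' semantics, a record whose upper bound equals the probe point is excluded, which is exactly why rows with upper() = CURRENT_TIMESTAMP are invisible under the @> CURRENT_TIMESTAMP restriction.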

> CREATE TABLE t (id integer, active_period tstzrange NOT NULL DEFAULT
> tstzrange(current_timestamp, NULL));

Why NULL instead of 'infinity'...?

> -- Disallowed due to hide_old_select policy.
> UPDATE t SET active_period = tstzrange(lower(active_period),
> CURRENT_TIMESTAMP);

Guess I'm still trying to figure out if you really intend for this to
make the records invisible to the 'most users' case.

> I'm happy to help with testing and documentation but first I need to
> understand what the intended functionality was. Right now it seems
> inconsistent between the simple single policy version and the multi policy
> version; the docs imply the single policy version is correct (it only seems
> to mention SELECT checks on RETURNING clauses).

I agree that the documentation could be improved here.

Thanks!

Stephen




Re: [HACKERS] [POC] hash partitioning

2017-04-14 Thread Robert Haas
On Fri, Apr 14, 2017 at 4:23 AM, Yugo Nagata  wrote:
> On Thu, 13 Apr 2017 16:40:29 -0400
> Robert Haas  wrote:
>> On Fri, Mar 17, 2017 at 7:57 AM, Yugo Nagata  wrote:
>> > I also understood that my design has a problem during pg_dump and
>> > pg_upgrade, and that some information to identify the partition
>> > is required independent of the command order. However, I feel that
>> > Amul's design is a bit complicated with the rule to specify modulus.
>> >
>> > I think we can use simpler syntax, for example, as below.
>> >
>> >  CREATE TABLE h1 PARTITION OF h FOR (0);
>> >  CREATE TABLE h2 PARTITION OF h FOR (1);
>> >  CREATE TABLE h3 PARTITION OF h FOR (2);
>>
>> I don't see how that can possibly work.  Until you see all the table
>> partitions, you don't know what the partitioning constraint for any
>> given partition should be, which seems to me to be a fatal problem.
>
> If a partition has an id, the partitioning constraint can be written as
>
>  hash_func(hash_key) % N = id
>
> where N is the number of partitions. Doesn't it work?

Only if you know the number of partitions.  But with your syntax,
after seeing only the first of the CREATE TABLE .. PARTITION OF
commands, what should the partition constraint be?  It depends on how
many more such commands appear later in the dump file, which you do
not know at that point.

>> I agree that Amul's syntax - really, I proposed it to him - is not the
>> simplest, but I think all the details needed to reconstruct the
>> partitioning constraint need to be explicit.  Otherwise, I'm pretty
>> sure things we're going to have lots of problems that we can't really
>> solve cleanly.  We can later invent convenience syntax that makes
>> common configurations easier to set up, but we should invent the
>> syntax that spells out all the details first.
>
> I have a question about Amul's syntax. After we create partitions
> as followings,
>
>  create table foo (a integer, b text) partition by hash (a);
>  create table foo1 partition of foo with (modulus 2, remainder 0);
>  create table foo2 partition of foo with (modulus 2, remainder 1);
>
> we cannot create any additional partitions for the partition.
>
> Then, after inserting records into foo1 and foo2, how we can
> increase the number of partitions?

You can detach foo1, create two new partitions with modulus 4 and
remainders 0 and 2, and move the data over from the old partition.

I realize that's not as automated as you might like, but it's no worse
than what is currently required for list and range partitioning when
you split a partition.  Someday we might build in tools to do that
kind of data migration automatically, but right now we have none.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor typo in partition.c

2017-04-14 Thread Robert Haas
On Fri, Apr 14, 2017 at 3:43 AM, Simon Riggs  wrote:
> Oh, just looks very different to what we discussed, so I presumed more
> changes were coming.

I waited quite a while for you to review Amit's patches on that
thread, but as you never did, I eventually picked it up.

> Where shall I mention BRIN in that chapter?

You never answered my email about why BRIN belongs in that chapter.  I
maintain that it doesn't, because BRIN is not a partitioning method.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Logical replication launcher uses wal_retrieve_retry_interval

2017-04-14 Thread Petr Jelinek
On 14/04/17 14:30, Michael Paquier wrote:
> On Fri, Apr 14, 2017 at 9:19 PM, Petr Jelinek
>  wrote:
>> I am not quite sure adding more GUCs is all that great option. When
>> writing the patches I was wondering if we should perhaps rename the
>> wal_receiver_timeout and wal_retrieve_retry_interval to something that
>> makes more sense for both physical and logical replication though.
> 
> It seems to me that you should really have a different GUC,
> wal_retrieve_retry_interval has been designed to work in the startup
> process, and I think that it should still only behave as originally
> designed.

Ah yeah, I am actually confusing it with wal_receiver_timeout, which
behaves the same for the walreceiver and the subscription worker. So yeah,
it makes sense to have a separate GUC (I wonder if we then need yet another
one for tablesync though, since both of those will be controlling restarts
of subscription workers after a crash).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Small patch for pg_basebackup argument parsing

2017-04-14 Thread Robert Haas
On Fri, Apr 14, 2017 at 2:32 AM, Pierre Ducroquet  wrote:
> On Friday, April 14, 2017 8:44:37 AM CEST Michael Paquier wrote:
>> On Fri, Apr 14, 2017 at 6:32 AM, Pierre Ducroquet 
> wrote:
>> > Yesterday while doing a few pg_basebackup, I realized that the integer
>> > parameters were not properly checked against invalid input.
> It is not a critical issue, but this could be misleading for a user who
>> > writes -z max or -s 0.5…
>> > I've attached the patch to this mail. Should I add it to the next commit
>> > fest or is it not needed for such small patches ?
>>
>> A call to atoi is actually equivalent to strtol with the rounding:
>> (int)strtol(str, (char **)NULL, 10);
>> So I don't think this is worth caring.
>
> The problem with atoi is that it simply ignores any invalid input and returns
> 0 instead.
> That's why I did this patch: I made a typo when calling pg_basebackup
> and was not warned about the invalid input.

I agree.  I think it would be worth going through and cleaning up
every instance of this in the source tree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Tuplesort merge pre-reading

2017-04-14 Thread Robert Haas
On Fri, Apr 14, 2017 at 1:19 AM, Peter Geoghegan  wrote:
> Interestingly enough, I think that Knuth was pretty much spot on with
> his "sweet spot" of 7 tapes, even if you have modern hardware. Commit
> df700e6 (where the sweet spot of merge order 7 was no longer always
> used) was effective because it masked certain overheads that we
> experience when doing multiple passes, overheads that Heikki and I
> mostly removed. This was confirmed by Robert's testing of my merge
> order cap work for commit fc19c18, where he found that using 7 tapes
> was only slightly worse than using many hundreds of tapes. If we could
> somehow be completely effective in making access to logical tapes
> perfectly sequential, then 7 tapes would probably be noticeably
> *faster*, due to CPU caching effects.

I don't think there's any one fixed answer, because increasing the
number of tapes reduces I/O at the cost of more CPU, and vice versa.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Different table schema in logical replication crashes

2017-04-14 Thread Petr Jelinek

On 13/04/17 05:04, Euler Taveira wrote:
> Hi,
> 
> If a certain table has different schemas and the subscriber table has an
> unmatched column with a not null constraint, the logical replication
> crashes with the above stack trace.
> 
> [snip]
> 
> Are we prepared to support different schemas in v10? Or should we
> disallow it for v10 and add a TODO?
> 

Ah nuts, yes it's supposed to be supported, we seem to not initialize
cstate->range_table in tablesync which causes this bug. The CopyState
struct is private to copy.c so we can't easily set cstate->range_table
externally. I wonder if tablesync should just construct CopyStmt instead
of calling the lower level API.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical replication and inheritance

2017-04-14 Thread Petr Jelinek
On 14/04/17 06:14, Amit Langote wrote:
> On 2017/04/14 10:57, Petr Jelinek wrote:
>> I don't think inheritance and partitioning should behave the same in
>> terms of logical replication.
> 
> I see.
> 
>>
>> For me the current behavior with inherited tables is correct.
> 
> OK.
> 
> By the way, what do you think about the pg_dump example/issue I mentioned?
>  Is that a pg_dump problem or backend?  To reiterate, if you add an
> inheritance parent to a publication, dump the database, and restore it
> into another, an error occurs.  Why?  Because every child table is added
> *twice* because of the way publication tables are dumped.  Once by itself
> and again via inheritance expansion when the parent is added.  Adding a
> table again to the same publication is currently an error, which I was
> wondering if it could be made a no-op instead.
> 

That's a good question. I think it's fair to treat membership of a table in
a publication as a "soft object" or "property" rather than a real object
where we would enforce an error on ADD of something that's already there. So
I am not against changing it to a no-op (like doing ALTER SEQUENCE ... OWNED
BY pointing at the column which is already the current owner).

>> What I would like partitioned tables support to look like is that if we
>> add partitioned table, the data decoded from any of the partitions would
>> be sent as if the change was for that partitioned table so that the
>> partitioning scheme on subscriber does not need to be same as on
>> publisher. That's nontrivial to do though probably.
> 
> I agree that it'd be nontrivial.  I'm not sure if you're also implying
> that a row decoded from a partition is *never* published as having been
> inserted into the partition itself.  A row can end up in a partition via
> both the inserts into the partitioned table and the partition itself.
> Also, AFAICT, obviously the output plugin would have to have a dedicated
> logic to determine which table to publish a given row as coming from
> (possibly the root partitioned table), since nothing decode-able from WAL
> is going to have that information.
> 

Well, yes, that's what I mean by nontrivial; IMHO the hard part is defining
the behavior (the coding is the easy part here). I think there are more or
less 3 options: a) partitions can't be added to a publication
individually, or b) they will always publish their data as their main
partitioned table (which for the output plugin means the same thing, ie we'll
only see the rows as changes to partitioned tables), or alternatively c)
if both a partition and its partitioned table are added we publish changes
twice, but that seems like quite a bad option to me.

This was BTW the reason why I was saying in the original partitioning
thread that it's unclear to me from the documentation what the general
guiding principle is in terms of treating partitions as individual objects
or not. Currently it's a mixed bag: structure is treated as global for the
whole partitioned tree, but things like indexes can be defined
separately on individual partitions. Also existing tables retain some of
their differences when they are being added as partitions but other
differences are strictly checked and will result in error. I don't quite
understand if this is current implementation limitation and we'll
eventually "lock down" the differences (when we have global indexes and
such) or if it's intended long term to allow differences between
partitions and what will be the rules for what's allowed and what not.

> Also, with the current state of partitioning, if a row decoded and
> published as coming from the partitioned table had no corresponding
> partition defined on the destination server, an error will occur in the
> subscription worker I'd guess.  Or may be we don't assume anything about
> whether the table on the subscription end is partitioned or not.
> 
> Anyway, that perhaps also means that for time being, we might need to go
> with the following option that Robert mentioned (I suppose strictly in the
> context of partitioned tables, not general inheritance):
> 
> (1) That's an error; the user should publish the partitions instead.
> 

Yes I agree with Robert's statement and that's how it should behave already.

> That is, we should make adding a partitioned table to a publication a user
> error (feature not supported).
> 

It should already be an error, no? (Unless some change was committed that I
missed; I definitely wrote it as an error in the original patch.)

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Row Level Security UPDATE Confusion

2017-04-14 Thread Rod Taylor
On Fri, Apr 14, 2017 at 7:41 AM, Rod Taylor  wrote:

>
>
>
> On Thu, Apr 13, 2017 at 5:31 PM, Stephen Frost  wrote:
>
>> Rod, all,
>>
>> * Joe Conway (m...@joeconway.com) wrote:
>> > On 04/13/2017 01:31 PM, Stephen Frost wrote:
>> > > * Robert Haas (robertmh...@gmail.com) wrote:
>> > >> On Thu, Apr 6, 2017 at 4:05 PM, Rod Taylor 
>> wrote:
>> > >> > I'm a little confused on why a SELECT policy fires against the NEW
>> record
>> > >> > for an UPDATE when using multiple FOR policies. The ALL policy
>> doesn't seem
>> > >> > to have that restriction.
>> > >>
>> > >> My guess is that you have found a bug.
>> > >
>> > > Indeed.  Joe's been looking into it and I'm hoping to find some time
>> to
>> > > dig into it shortly.
>> >
>> > >> CREATE POLICY split_select ON t FOR SELECT TO split
>> > >> USING (value > 0);
>> > >> CREATE POLICY split_update ON t FOR UPDATE TO split
>> > >> USING (true) WITH CHECK (true);
>> >
>> > Yes -- from what I can see in gdb:
>>
>> Actually, looking at this again, the complaint appears to be that you
>> can't "give away" records.  That was a topic of much discussion and I'm
>> reasonably sure that was what we ended up deciding made the most sense.
>> You have to be able to see records to be able to update them (you can't
>> update records you can't see), and you have to be able to see the result
>> of your update.  I don't doubt that we could improve the documentation
>> around this (and apparently the code comments, according to Joe..).
>>
>>
> Then there is a bug in the simpler statement which happily lets you give
> away records.
>
> CREATE POLICY simple_all ON t TO simple USING (value > 0) WITH CHECK
> (true);
>
> SET session authorization simple;
> SELECT * FROM t;
> UPDATE t SET value = value * -1 WHERE value = 1;
> -- No error and value is -1 at the end.
>


My actual use-case involves a range. Most users can see and manipulate the
record when CURRENT_TIMESTAMP is within active_period. Some users
(staff/account admins) can see recently dead records too. And a 3rd group
(senior staff) have no time restriction, though there are a few customers
they cannot see due to their information being a touch more sensitive.
I've simplified the below rules to just deal with active_period and the
view of the majority of users (@> CURRENT_TIMESTAMP).

NOTE: the active_period range is '[)' by default, so records with upper() =
CURRENT_TIMESTAMP are not visible with @> CURRENT_TIMESTAMP restriction.


CREATE TABLE t (id integer, active_period tstzrange NOT NULL DEFAULT
tstzrange(current_timestamp, NULL));


The below policy is allowed but requires that 1ms slop to accommodate the wi

The updated record is invisible to USING, but a trigger is still required to
enforce specific upper and starting values. I have a trigger enforcing
specific upper/lower values for the range for specific ROLEs, so I had the
thought that I might move the ROLE-specific trigger logic into the RLS
mechanism.

CREATE POLICY hide_old ON t TO s
  USING ( active_period @> CURRENT_TIMESTAMP)
 WITH CHECK ( active_period && tstzrange(current_timestamp - interval
'0.001 seconds', current_timestamp, '[]'));

-- This is effectively a delete for the above policy. It becomes
immediately invisible due to USING restriction.
UPDATE t SET active_period = tstzrange(lower(active_period),
CURRENT_TIMESTAMP);
SELECT count(*) FROM t; -- 0 records



I tried to tighten the above rules, so INSERT must have upper of NULL and
UPDATE must
set upper to exactly CURRENT_TIMESTAMP. Clearly I can achieve this using
triggers for
enforcement but I tried to abuse RLS instead because it is a role specific
restriction.

I was surprised when hide_old_select->USING was preventing the UPDATE when
the simple single policy version let it through.

CREATE POLICY hide_old_select ON t FOR SELECT TO s
  USING ( active_period @> CURRENT_TIMESTAMP);
CREATE POLICY hide_old_insert ON t FOR INSERT to s
 WITH CHECK ( lower(active_period) = CURRENT_TIMESTAMP AND
upper(active_period) IS NULL);

CREATE POLICY hide_old_update ON t FOR UPDATE TO s
  USING ( active_period @> CURRENT_TIMESTAMP)
 WITH CHECK ( upper(active_period) = CURRENT_TIMESTAMP);

-- Disallowed due to hide_old_select policy.
UPDATE t SET active_period = tstzrange(lower(active_period),
CURRENT_TIMESTAMP);



I'm happy to help with testing and documentation but first I need to
understand what the intended functionality was. Right now it seems
inconsistent between the simple single policy version and the multi policy
version; the docs imply the single policy version is correct (it only seems
to mention SELECT checks on RETURNING clauses).


-- 
Rod Taylor


Re: [HACKERS] Allowing extended stats on foreign and partitioned tables

2017-04-14 Thread Alvaro Herrera
Noah Misch wrote:
> On Mon, Apr 10, 2017 at 09:39:22PM +1200, David Rowley wrote:
> > While reviewing extended stats I noticed that it was possible to
> > create extended stats on many object types, including sequences. I
> > mentioned that this should be disallowed. Statistics were then changed
> > to be only allowed on plain tables and materialized views.
> > 
> > This should be relaxed again to allow anything ANALYZE is allowed on.
> > 
> > The attached does this.
> 
> [Action required within three days.  This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 10 open item.  Alvaro,
> since you committed the patch believed to have created it, you own this open
> item.

I'm going to review David's patch and post a detailed review on
Wednesday 19th.

I think the patch is reasonable, but it's lacking new tests, and I'm
worried that the error message changes do not affect the existing tests
in the area.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Logical replication launcher uses wal_retrieve_retry_interval

2017-04-14 Thread Masahiko Sawada
On Fri, Apr 14, 2017 at 9:19 PM, Petr Jelinek
 wrote:
> On 14/04/17 12:57, Masahiko Sawada wrote:
>> Hi,
>>
>> I noticed that the logical replication launcher uses
>> wal_retrieve_retry_interval as an interval for launching logical
>> replication worker process. This behavior is not documented and I
>> guess this is no longer consistent with what its name means.
>>
>
> Yes that was done based on reviews (and based on general attitude of not
> adding more knobs that are similar in meaning). It is briefly documented
> in the replication config section. Same is true for wal_receiver_timeout
> btw.

Thank you for the comment!

These two parameters are classed as a standby server parameter in the
document. We might want to fix it.

>> I think that we should either introduce a new GUC parameter (say
>> logical_replication_retry_interval?) for this or update the
>> description of wal_retrieve_retry_interval. IMO the former is better.
>>
>
> I am not quite sure adding more GUCs is all that great option. When
> writing the patches I was wondering if we should perhaps rename the
> wal_receiver_timeout and wal_retrieve_retry_interval to something that
> makes more sense for both physical and logical replication though.
>

It would work, but IMO having multiple different behaviors for one GUC
parameter is not a good design.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] Logical replication launcher uses wal_retrieve_retry_interval

2017-04-14 Thread Michael Paquier
On Fri, Apr 14, 2017 at 9:19 PM, Petr Jelinek
 wrote:
> I am not quite sure adding more GUCs is all that great option. When
> writing the patches I was wondering if we should perhaps rename the
> wal_receiver_timeout and wal_retrieve_retry_interval to something that
> makes more sense for both physical and logical replication though.

It seems to me that you should really have a different GUC,
wal_retrieve_retry_interval has been designed to work in the startup
process, and I think that it should still only behave as originally
designed. And at some point I think that it would make as well sense
to be able to make this parameter settable at worker-level.
-- 
Michael




Re: [HACKERS] Letting the client choose the protocol to use during a SASL exchange

2017-04-14 Thread Michael Paquier
On Fri, Apr 14, 2017 at 8:28 PM, Craig Ringer
 wrote:
> There's no point advertising scram-512 if only -256 can work for 'bob'
> because that's what we have in pg_authid.

The possibility to have multiple verifiers has other benefits than
that, password rolling being one. We may want to revisit that once
there is a need to have a pg_auth_verifiers, my intuition on the
matter is that we are years away from it, but we'll very likely need
it for more reasons than the one you are raising here.

> Yes, filtering the advertised mechs exposes info. But not being able to log
> in if you're the legitimate user without configuring the client with your
> password hash format would suck too.

Yup.
-- 
Michael




Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test

2017-04-14 Thread Stephen Frost
* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Fri, Apr 14, 2017 at 8:03 PM, Stephen Frost  wrote:
> > Some of those were specifically left around to test those code paths.
> > I'm not sure if these were those or not though, Andrew was the one who
> > reviewed the various pg_dumpall calls to add --no-sync in places.
> 
> Well, Andrew has pushed the patch I have written, and the calls of
> pg_dumpall in pg_upgrade use --no-sync. The ones intentionally left
> are in src/bin/pg_dump/t/002_pg_dump.pl.

Ok. :)

Thanks!

Stephen




Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test

2017-04-14 Thread Michael Paquier
On Fri, Apr 14, 2017 at 8:03 PM, Stephen Frost  wrote:
> Some of those were specifically left around to test those code paths.
> I'm not sure if these were those or not though, Andrew was the one who
> reviewed the various pg_dumpall calls to add --no-sync in places.

Well, Andrew has pushed the patch I have written, and the calls of
pg_dumpall in pg_upgrade use --no-sync. The ones intentionally left
are in src/bin/pg_dump/t/002_pg_dump.pl.
-- 
Michael




Re: [HACKERS] Logical replication launcher uses wal_retrieve_retry_interval

2017-04-14 Thread Petr Jelinek
On 14/04/17 12:57, Masahiko Sawada wrote:
> Hi,
> 
> I noticed that the logical replication launcher uses
> wal_retrieve_retry_interval as an interval for launching logical
> replication worker process. This behavior is not documented and I
> guess this is no longer consistent with what its name means.
> 

Yes that was done based on reviews (and based on general attitude of not
adding more knobs that are similar in meaning). It is briefly documented
in the replication config section. Same is true for wal_receiver_timeout
btw.

> I think that we should either introduce a new GUC parameter (say
> logical_replication_retry_interval?) for this or update the
> description of wal_retrieve_retry_interval. IMO the former is better.
> 

I am not quite sure adding more GUCs is all that great option. When
writing the patches I was wondering if we should perhaps rename the
wal_receiver_timeout and wal_retrieve_retry_interval to something that
makes more sense for both physical and logical replication though.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Interval for launching the table sync worker

2017-04-14 Thread Petr Jelinek
On 14/04/17 12:18, Masahiko Sawada wrote:
> On Fri, Apr 14, 2017 at 7:09 AM, Petr Jelinek
>  wrote:
>> On 13/04/17 12:23, Masahiko Sawada wrote:
>>> On Thu, Apr 13, 2017 at 11:53 AM, Masahiko Sawada  
>>> wrote:
 On Wed, Apr 12, 2017 at 11:46 PM, Peter Eisentraut
  wrote:
> On 4/12/17 00:48, Masahiko Sawada wrote:
>> On Wed, Apr 12, 2017 at 1:28 PM, Peter Eisentraut
>>> Perhaps instead of a global last_start_time, we store a per relation
>>> last_start_time in SubscriptionRelState?
>>
>> I was thinking the same. But a problem is that the list of
>> SubscriptionRelState is refreshed whenever the syncing table state
>> becomes invalid (table_state_valid = false). I guess we need to
>> improve these logic including GetSubscriptionNotReadyRelations().
>
> The table states are invalidated on a syscache callback from
> pg_subscription_rel, which happens roughly speaking when a table
> finishes the initial sync.  So if we're worried about failing tablesync
> workers relaunching too quickly, this would only be a problem if a
> tablesync of another table finishes right in that restart window.  That
> doesn't seem a terrible issue to me.
>

 I think the table states are invalidated whenever the table sync
 worker starts, because the table sync worker updates its status in
 pg_subscription_rel and commits it before starting the actual copy. So we
 cannot rely on that. I thought we could store last_start_time in
 pg_subscription_rel but it might be overkill. I'm now thinking to
 change GetSubscriptionNotReadyRelations so that last_start_time in
 SubscriptionRelState is carried over to the new list.

>>>
>>> Attached the latest patch. It wasn't actually necessary to change
>>> GetSubscriptionNotReadyRelations. I just changed the logic refreshing
>>> the sync table state list.
>>> Please give me feedback.
>>>
>>
>> Hmm this might work. Although I was actually wondering if we could store
>> the last start timestamp in the worker shared memory and do some magic
>> with that (ie, not clearing subid and relid and try to then do rate
>> limiting based on subid+relid+timestamp stored in shmem). That would
>> then work same way for the main apply workers as well. It would have the
>> disadvantage that if some tables were consistently failing, no other
>> tables could get synchronized as the amount of workers is limited.
> 
> Hmm I guess that it's not a good design that a table sync worker and an
> apply worker for a relation take sole possession of a worker slot
> until they succeed. They would bother each other.
> 

Not sure what you mean by apply worker for relation, I meant the main
subscription apply worker, the "per relation apply worker" is same as
tablesync worker.

Thinking about the possible failure scenarios more, I guess you are
right. If there is a "global" event for the publisher/subscriber
connection (ie the network goes down or something) it will affect the main
worker as well, so it won't launch anything. If there is a table-specific
issue it will affect only that table.

The corner-cases with doing it per table where we launch tablesync
needlessly are pretty much just a) connection limit on publisher
reached, b) slot limit on publisher reached; that's probably okay. We
might be able to catch these two specifically and do some global
limiting based on those (something like your original patch except the
start-time global variable would only be set on specific error and
stored in shmem), but that does not need to be solved immediately.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Row Level Security UPDATE Confusion

2017-04-14 Thread Stephen Frost
Rod,

* Rod Taylor (rod.tay...@gmail.com) wrote:
> Then there is a bug in the simpler statement which happily lets you give
> away records.
> 
> CREATE POLICY simple_all ON t TO simple USING (value > 0) WITH CHECK (true);
> 
> SET session authorization simple;
> SELECT * FROM t;
> UPDATE t SET value = value * -1 WHERE value = 1;
> -- No error and value is -1 at the end.

Hm, that does seem like it's not matching up with the intent, likely
because it's an 'ALL' policy instead of individual policies.

Out of curiosity, is there a particular use-case here that you're
working towards, or just testing it out to see how it works?

Thanks!

Stephen




Re: [HACKERS] Row Level Security UPDATE Confusion

2017-04-14 Thread Rod Taylor
On Thu, Apr 13, 2017 at 5:31 PM, Stephen Frost  wrote:

> Rod, all,
>
> * Joe Conway (m...@joeconway.com) wrote:
> > On 04/13/2017 01:31 PM, Stephen Frost wrote:
> > > * Robert Haas (robertmh...@gmail.com) wrote:
> > >> On Thu, Apr 6, 2017 at 4:05 PM, Rod Taylor 
> wrote:
> > >> > I'm a little confused on why a SELECT policy fires against the NEW
> record
> > >> > for an UPDATE when using multiple FOR policies. The ALL policy
> doesn't seem
> > >> > to have that restriction.
> > >>
> > >> My guess is that you have found a bug.
> > >
> > > Indeed.  Joe's been looking into it and I'm hoping to find some time to
> > > dig into it shortly.
> >
> > >> CREATE POLICY split_select ON t FOR SELECT TO split
> > >> USING (value > 0);
> > >> CREATE POLICY split_update ON t FOR UPDATE TO split
> > >> USING (true) WITH CHECK (true);
> >
> > Yes -- from what I can see in gdb:
>
> Actually, looking at this again, the complaint appears to be that you
> can't "give away" records.  That was a topic of much discussion and I'm
> reasonably sure that was what we ended up deciding made the most sense.
> You have to be able to see records to be able to update them (you can't
> update records you can't see), and you have to be able to see the result
> of your update.  I don't doubt that we could improve the documentation
> around this (and apparently the code comments, according to Joe..).
>
>
Then there is a bug in the simpler statement which happily lets you give
away records.

CREATE POLICY simple_all ON t TO simple USING (value > 0) WITH CHECK (true);

SET session authorization simple;
SELECT * FROM t;
UPDATE t SET value = value * -1 WHERE value = 1;
-- No error and value is -1 at the end.



-- 
Rod Taylor


Re: [HACKERS] Letting the client choose the protocol to use during a SASL exchange

2017-04-14 Thread Craig Ringer
On 14 Apr. 2017 10:44, "Michael Paquier"  wrote:

On Fri, Apr 14, 2017 at 1:37 AM, Heikki Linnakangas  wrote:
> On 04/13/2017 05:53 AM, Michael Paquier wrote:
>> +* Parse the list of SASL authentication mechanisms in the
>> +* AuthenticationSASL message, and select the best mechanism that we
>> +* support.  (Only SCRAM-SHA-256 is supported at the moment.)
>>  */
>> -   if (strcmp(auth_mechanism, SCRAM_SHA256_NAME) == 0)
>> +   for (;;)
>> Just an idea here: being able to enforce the selection with an
>> environment variable (useful for testing as well in the future).
>
> Hmm. It wouldn't do much, as long as SCRAM-SHA-256 is the only supported
> mechanism. In general, there is no way to tell libpq to e.g. not do plain
> password authentication, which is more pressing than choosing a particular
> SASL mechanism. So I think we should have libpq options to control that,
but
> it's a bigger feature than just adding a debug environment variable here.

Of course, my last sentence implied that this may be useful once more
than 1 mechanism is added. This definitely cannot be a connection
parameter. Your last sentence makes me guess that we agree on that.
But those are thoughts for later..


Are we going to have issues with mech negotiation re the ability to
store auth data for >1 mech and access it early enough?

Presumably we'll need multiple digests for a user. For example if we want
to allow the choice of mechs scram-256 and scram-512 we need different
stored hashes for the same user in pg_authid. And we'll possibly need to be
able to tell at the time we advertise mechs which users have creds for
which mechs; otherwise we'll advertise mechs with which they can never succeed. The
client has no way to make a sensible choice of mech if some arbitrary
subset (maybe just 1) will work for a given user.

There's no point advertising scram-512 if only -256 can work for 'bob'
because that's what we have in pg_authid.

Yes, filtering the advertised mechs exposes info. But not being able to log
in if you're the legitimate user without configuring the client with your
password hash format would suck too.


> Thanks for the review! I've pushed these patches, after a bunch of little
> cleanups here and there, and fixing a few garden-variety bugs in the
> GSS/SSPI changes.

Committed patches look good to me after a second lookup. Thanks!
--
Michael


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test

2017-04-14 Thread Stephen Frost
Michael,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Thu, Apr 6, 2017 at 7:48 AM, Michael Paquier
>  wrote:
> > On Wed, Apr 5, 2017 at 10:24 PM, Stephen Frost  wrote:
> >> * Michael Paquier (michael.paqu...@gmail.com) wrote:
> >>> 1) Initialize the old cluster and start it.
> >>> 2) create a bunch of databases with full range of ascii characters.
> >>> 3) Run regression tests.
> >>> 4) Take dump on old cluster.
> >>> 4) Stop the old cluster.
> >>> 5) Initialize the new cluster.
> >>> 6) Run pg_upgrade.
> >>> 7) Start new cluster.
> >>> 8) Take dump from it.
> >>> 9) Run deletion script (Oops forgot this part!)
> >>
> >> Presumably the check to match the old dump against the new one is also
> >> performed?
> >
> > Yes. That's run with command_ok() at the end.
> 
> Attached is an updated patch to use --no-sync with pg_dumpall calls.

Some of those were specifically left around to test those code paths.
I'm not sure if these were those or not though, Andrew was the one who
reviewed the various pg_dumpall calls to add --no-sync in places.

Thanks!

Stephen




[HACKERS] Logical replication launcher uses wal_retrieve_retry_interval

2017-04-14 Thread Masahiko Sawada
Hi,

I noticed that the logical replication launcher uses
wal_retrieve_retry_interval as an interval for launching logical
replication worker process. This behavior is not documented and I
guess this is no longer consistent with what its name means.

I think that we should either introduce a new GUC parameter (say
logical_replication_retry_interval?) for this or update the
description of wal_retrieve_retry_interval. IMO the former is better.

Thought?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] Interval for launching the table sync worker

2017-04-14 Thread Masahiko Sawada
On Fri, Apr 14, 2017 at 7:09 AM, Petr Jelinek
 wrote:
> On 13/04/17 12:23, Masahiko Sawada wrote:
>> On Thu, Apr 13, 2017 at 11:53 AM, Masahiko Sawada  
>> wrote:
>>> On Wed, Apr 12, 2017 at 11:46 PM, Peter Eisentraut
>>>  wrote:
 On 4/12/17 00:48, Masahiko Sawada wrote:
> On Wed, Apr 12, 2017 at 1:28 PM, Peter Eisentraut
>> Perhaps instead of a global last_start_time, we store a per relation
>> last_start_time in SubscriptionRelState?
>
> I was thinking the same. But a problem is that the list of
> SubscriptionRelState is refreshed whenever the syncing table state
> becomes invalid (table_state_valid = false). I guess we need to
> improve these logic including GetSubscriptionNotReadyRelations().

 The table states are invalidated on a syscache callback from
 pg_subscription_rel, which happens roughly speaking when a table
 finishes the initial sync.  So if we're worried about failing tablesync
 workers relaunching to quickly, this would only be a problem if a
 tablesync of another table finishes right in that restart window.  That
 doesn't seem a terrible issue to me.

>>>
>>> I think the table states are invalidated whenever the table sync
>>> worker starts, because the table sync worker updates its status of
>>> pg_subscription_rel and commits it before starting actual copy. So we
>>> cannot rely on that. I thought we can store last_start_time into
>>> pg_subscription_rel but it might be overkill. I'm now thinking to
>>> change GetSubscriptionNotReadyRealtions so that last_start_time in
>>> SubscriptionRelState is taken over to new list.
>>>
>>
>> Attached the latest patch. It didn't actually necessary to change
>> GetSubscriptionNotReadyRelations. I just changed the logic refreshing
>> the sync table state list.
>> Please give me feedback.
>>
>
> Hmm this might work. Although I was actually wondering if we could store
> the last start timestamp in the worker shared memory and do some magic
> with that (ie, not clearing subid and relid and try to then do rate
> limiting based on subid+relid+timestamp stored in shmem). That would
> then work same way for the main apply workers as well. It would have the
> disadvantage that if some tables were consistently failing, no other
> tables could get synchronized as the amount of workers is limited.

Hmm I guess it's not a good design that a table sync worker and an
apply worker for a relation take sole possession of a worker slot
until they succeed. They would bother each other.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-04-14 Thread Kyotaro HORIGUCHI
At Mon, 10 Apr 2017 19:26:11 +1200, David Rowley  
wrote in 
> ... and of course the other functions matching *wal*location*
> 
> My thoughts here are that we're already breaking backward
> compatibility of these functions for PG10, so thought we might want to
> use this as an opportunity to fix the naming a bit more.
> 
> I feel that the "location" word not the best choice.  We also have a
> function called pg_tablespace_location() to give us the path that a
> tablespace is stored in, so I could understand anyone who's confused
> about what pg_current_wal_location() might do if they're looking at
> the function name only, and not the docs.
> 
> For me, "lsn" suits these function names much better, so I'd like to
> see what other's think.
> 
> It would be sad to miss this opportunity without at least discussing this 
> first.
> 
> The functions in question are:
> 
> postgres=# \dfS *wal*location*
>List of functions
>Schema   |  Name  | Result data type |
> Argument data types |  Type
> ++--+-+
>  pg_catalog | pg_current_wal_flush_location  | pg_lsn   |
>| normal
>  pg_catalog | pg_current_wal_insert_location | pg_lsn   |
>| normal
>  pg_catalog | pg_current_wal_location| pg_lsn   |
>| normal
>  pg_catalog | pg_last_wal_receive_location   | pg_lsn   |
>| normal
>  pg_catalog | pg_last_wal_replay_location| pg_lsn   |
>| normal
>  pg_catalog | pg_wal_location_diff   | numeric  |
> pg_lsn, pg_lsn  | normal
> (6 rows)
> 
> Opinions?

Similarly, these columns may need renaming.

s=# select attname, attrelid::regclass from pg_attribute where attname like 
'%location%';
 attname |  attrelid   
-+-
 sent_location   | pg_stat_replication
 write_location  | pg_stat_replication
 flush_location  | pg_stat_replication
 replay_location | pg_stat_replication
(4 rows)


Currently the following functions and columns are using 'lsn'.

=# \dfS *lsn*
 List of functions
   Schema   |Name | Result data type | Argument data types |  Type  
+-+--+-+
 pg_catalog | pg_lsn_cmp  | integer  | pg_lsn, pg_lsn  | normal
 pg_catalog | pg_lsn_eq   | boolean  | pg_lsn, pg_lsn  | normal
...
 pg_catalog | pg_lsn_recv | pg_lsn   | internal| normal
 pg_catalog | pg_lsn_send | bytea| pg_lsn  | normal
(13 rows)


=# select distinct attname from pg_attribute where attname like '%lsn%';
   attname   
-
 confirmed_flush_lsn
 latest_end_lsn
 local_lsn
 receive_start_lsn
 received_lsn
 remote_lsn
 restart_lsn
 srsublsn
(8 rows)


Features are already frozen, but this seems a bit inconsistent..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: [HACKERS] [POC] hash partitioning

2017-04-14 Thread Yugo Nagata
On Thu, 13 Apr 2017 16:40:29 -0400
Robert Haas  wrote:

> On Fri, Mar 17, 2017 at 7:57 AM, Yugo Nagata  wrote:
> > I also understanded that my design has a problem during pg_dump and
> > pg_upgrade, and that some information to identify the partition
> > is required not depending the command order. However, I feel that
> > Amul's design is a bit complicated with the rule to specify modulus.
> >
> > I think we can use simpler syntax, for example, as below.
> >
> >  CREATE TABLE h1 PARTITION OF h FOR (0);
> >  CREATE TABLE h2 PARTITION OF h FOR (1);
> >  CREATE TABLE h3 PARTITION OF h FOR (2);
> 
> I don't see how that can possibly work.  Until you see all the table
> partitions, you don't know what the partitioning constraint for any
> given partition should be, which seems to me to be a fatal problem.

If a partition has an id, the partitioning constraint can be written as

 hash_func(hash_key) % N = id

where N is the number of partitions. Doesn't that work?

> I agree that Amul's syntax - really, I proposed it to him - is not the
> simplest, but I think all the details needed to reconstruct the
> partitioning constraint need to be explicit.  Otherwise, I'm pretty
> sure things we're going to have lots of problems that we can't really
> solve cleanly.  We can later invent convenience syntax that makes
> common configurations easier to set up, but we should invent the
> syntax that spells out all the details first.

I have a question about Amul's syntax. After we create partitions
as followings, 

 create table foo (a integer, b text) partition by hash (a);
 create table foo1 partition of foo with (modulus 2, remainder 0);
 create table foo2 partition of foo with (modulus 2, remainder 1);  

we cannot create any additional partitions for this table.

Then, after inserting records into foo1 and foo2, how can we
increase the number of partitions?

> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


-- 
Yugo Nagata 




Re: [HACKERS] Minor typo in partition.c

2017-04-14 Thread Amit Langote
On 2017/04/14 16:43, Simon Riggs wrote:
> On 14 April 2017 at 08:39, Amit Langote  wrote:
>> On 2017/04/14 16:25, Simon Riggs wrote:
>>> On 14 April 2017 at 08:13, Amit Langote  
>>> wrote:
>>>
 Attached patch gets rid of a "is has".
>>>
>>> Yes, it's a typo, but doesn't add missing information or change the meaning.
>>>
>>> It should be fixed, but could I suggest we include that in your next
>>> lot of doc changes, rather than keep making single minor changes?
>>
>> Sorry, I'm not working on any new doc changes at the moment, because...
>>
>>> (I had understood that we were going to add more docs together, but I
>>> was awaiting your restructuring patch)
>>
>> The restructuring patch was committed on Apr 1st, after I posted the first
>> version of the patch on March 3rd here:
>>
>> https://www.postgresql.org/message-id/4eb8cfe4-b6c9-903d-4ce8-2a31fcee27e1%40lab.ntt.co.jp
>>
>> The committed version:
>>
>> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=8f18a880a5f138d4da94173d15514142331f8de6
> 
> Oh, just looks very different to what we discussed, so I presumed more
> changes were coming.

It looks different perhaps as the result of working through the review
comments:

https://www.postgresql.org/message-id/CA%2BTgmoawNTaXw2yq01U9c0FxhUnty7avz7V0di0iyiWGkR0tfg%40mail.gmail.com

> Where shall I mention BRIN in that chapter?

Maybe just append a new  right below where  ends?

I had included both BRIN and UNION ALL views under a sub-section titled
"Alternative Methods" in my original patch (March 3rd one), but removed
the whole sub-section per review comments.  Perhaps my patch didn't do a
good enough job of describing either BRIN or UNION ALL views, so it
ended up not being convincing enough to be made part of the section.

By the way, will you also be adding details on UNION ALL views (I remember
you said you would along with BRIN)?  Actually, my patch copied verbatim
what older docs had about them that you said is too short a description.

Thanks,
Amit





Re: [HACKERS] Minor typo in partition.c

2017-04-14 Thread Simon Riggs
On 14 April 2017 at 08:39, Amit Langote  wrote:
> On 2017/04/14 16:25, Simon Riggs wrote:
>> On 14 April 2017 at 08:13, Amit Langote  
>> wrote:
>>
>>> Attached patch gets rid of a "is has".
>>
>> Yes, it's a typo, but doesn't add missing information or change the meaning.
>>
>> It should be fixed, but could I suggest we include that in your next
>> lot of doc changes, rather than keep making single minor changes?
>
> Sorry, I'm not working on any new doc changes at the moment, because...
>
>> (I had understood that we were going to add more docs together, but I
>> was awaiting your restructuring patch)
>
> The restructuring patch was committed on Apr 1st, after I posted the first
> version of the patch on March 3rd here:
>
> https://www.postgresql.org/message-id/4eb8cfe4-b6c9-903d-4ce8-2a31fcee27e1%40lab.ntt.co.jp
>
> The committed version:
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=8f18a880a5f138d4da94173d15514142331f8de6

Oh, just looks very different to what we discussed, so I presumed more
changes were coming.

Where shall I mention BRIN in that chapter?

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-14 Thread Simon Riggs
On 13 April 2017 at 18:47, Fujii Masao  wrote:

> But on second thought, I don't think that reporting NULL as the priority when
> quorum-based sync replication is used is less confusing. When there is async
> standby, we report 0 as its priority when synchronous_standby_names is empty
> or a priority-based sync replication is configured. But with the patch, when
> a quorum-based one is specified, NULL is reported for that.
> Isn't this confusing?

To me, yes, it is confusing.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Minor typo in partition.c

2017-04-14 Thread Amit Langote
On 2017/04/14 16:25, Simon Riggs wrote:
> On 14 April 2017 at 08:13, Amit Langote  wrote:
> 
>> Attached patch gets rid of a "is has".
> 
> Yes, it's a typo, but doesn't add missing information or change the meaning.
> 
> It should be fixed, but could I suggest we include that in your next
> lot of doc changes, rather than keep making single minor changes?

Sorry, I'm not working on any new doc changes at the moment, because...

> (I had understood that we were going to add more docs together, but I
> was awaiting your restructuring patch)

The restructuring patch was committed on Apr 1st, after I posted the first
version of the patch on March 3rd here:

https://www.postgresql.org/message-id/4eb8cfe4-b6c9-903d-4ce8-2a31fcee27e1%40lab.ntt.co.jp

The committed version:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=8f18a880a5f138d4da94173d15514142331f8de6

Thanks,
Amit





Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-14 Thread Kyotaro HORIGUCHI
At Fri, 14 Apr 2017 10:47:46 +0900, Masahiko Sawada  
wrote in 
> On Fri, Apr 14, 2017 at 9:38 AM, Michael Paquier
>  wrote:
> > On Fri, Apr 14, 2017 at 2:47 AM, Fujii Masao  wrote:
> >> I'm thinking that it's less confusing to report always 0 as the priority of
> >> async standby whatever the setting of synchronous_standby_names is.
> >> Thought?
> >
> > Or we could have priority being reported to NULL for async standbys as
> > well, the priority number has no meaning for them anyway...
> 
> I agree to set the same thing (priority or NULL) to all sync standby
> in a quorum set. As Fujii-san mentioned,  I also think that it means
> all standbys in a quorum set can be chosen equally. But to less
> confusion for current user I'd not like to change current behavior of
> the priority of async standby.
> 
> >
> >> If we adopt this idea, in a quorum-based sync replication, I think that
> >> the priorities of all the standbys listed in synchronous_standby_names
> >> should be 1 instead of NULL. That is, those standbys have the same
> >> (highest) priority, and which means that any of them can be chosen as
> >> sync standby. Thought?
> >
> > Mainly my fault here to suggest that standbys in a quorum set should
> > have a priority set to NULL. My 2c on the matter is that I would be
> > fine with either having the async standbys having a priority of NULL
> > or using a priority of 1 for standbys in a quorum set. Though,
> > honestly, I find that showing a priority number for something where
> > this has no real meaning is even more confusing..
> 
> This is just a thought but we can merge sync_priority and sync_state
> into one column. The sync priority can have meaning only when the
> standby is considered as a sync standby or a potential standby in
> priority-based sync replication. For example, we can show something
> like 'sync:N' as states of the sync standby and 'potential:N' as
> states of the potential standby in priority-based sync replication,
> where N means the priority. In quorum-based sync replication it is
> just 'quorum'. It breaks backward compatibility, though.

I'm not sure how sync_priority is used, but I know sync_state is
used to detect the state or soundness of a replication set.
Introducing a variable part wouldn't be welcomed by such people.

The current shape of pg_stat_replication is as follows.

application_name | sync_priority | sync_state
-+---+
sby1 | 1 | sync
sby3 | 2 | potential
sby3 | 2 | potential
sby2 | 3 | potential

For this case, the following query will work.

SELECT count(*) > 0 FROM pg_stat_replication WHERE sync_state ='sync'

Maybe a bit confusing but we can use the field to show how many
hosts are required to form the quorum. For example the case
with s_s_names = 'ANY 3 (sby1,sby2,sby3,sby4)'.

application_name | sync_priority | sync_state
-+---+
sby1 | 3 | quorum
sby4 | 3 | quorum
sby2 | 3 | quorum
sby3 | 3 | quorum
sby3 | 3 | quorum
sby5 | 0 | async

In this case, we can detect satisfaction of the quorum setup by
something like this.

SELECT count(*) >= sync_priority FROM pg_stat_replication WHERE
   sync_state='quorum' GROUP BY sync_priority;

But, maybe we should provide a means to detect the standbys
really in sync with the master. This doesn't give such
information.


We could show top N standbys as priority-1 and others as
priority-2. (Of course this requires some additional
computation.)

application_name | flush_location | sync_priority | sync_state
-++---+---
sby1 | 0/700140   | 1 | quorum
sby4 | 0/700100   | 1 | quorum
sby2 | 0/700080   | 1 | quorum
sby3 | 0/6FFF3e   | 2 | quorum
sby3 | 0/50e345   | 2 | quorum
sby5 | 0/700140   | 0 | async

In this case, the soundness of the quorum set is checked by the
following query.

SELECT count(*) > 0 FROM pg_stat_replication WHERE sync_priority > 0;

We will find the standbys 'in sync' by the following query.

SELECT application_name FROM pg_stat_replication WHERE sync_priority = 1;


If the master doesn't have enough standbys. We could show the
state as the follows.. perhaps...

application_name | flush_location | sync_priority | sync_state
-++---+---
sby1 | 0/700140   | 0 | quorum
sby4 | 0/700100   | 0 | quorum
sby5 | 0/700140  

Re: [HACKERS] Minor typo in partition.c

2017-04-14 Thread Simon Riggs
On 14 April 2017 at 08:13, Amit Langote  wrote:

> Attached patch gets rid of a "is has".

Yes, it's a typo, but doesn't add missing information or change the meaning.

It should be fixed, but could I suggest we include that in your next
lot of doc changes, rather than keep making single minor changes?

(I had understood that we were going to add more docs together, but I
was awaiting your restructuring patch)

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Minor typo in partition.c

2017-04-14 Thread Amit Langote
Attached patch gets rid of a "is has".

Thanks,
Amit
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index ab891f6ef2..e0d2665a91 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -549,7 +549,7 @@ RelationBuildPartitionDesc(Relation rel)
 		{
 			int			orig_index = rbounds[i]->index;
 
-			/* If the old index is has no mapping, assign one */
+			/* If the old index has no mapping, assign one */
 			if (mapping[orig_index] == -1)
 mapping[orig_index] = next_index++;
 



Re: [HACKERS] Small patch for pg_basebackup argument parsing

2017-04-14 Thread Pierre Ducroquet
On Friday, April 14, 2017 8:44:37 AM CEST Michael Paquier wrote:
> On Fri, Apr 14, 2017 at 6:32 AM, Pierre Ducroquet  
wrote:
> > Yesterday while doing a few pg_basebackup, I realized that the integer
> > parameters were not properly checked against invalid input.
> > It is not a critical issue, but this could be misleading for an user who
> > writes -z max or -s 0.5…
> > I've attached the patch to this mail. Should I add it to the next commit
> > fest or is it not needed for such small patches ?
> 
> A call to atoi is actually equivalent to strtol with the rounding:
> (int)strtol(str, (char **)NULL, 10);
> So I don't think this is worth caring.

The problem with atoi is that it simply ignores any invalid input and returns 
0 instead.
That's why I did this patch: I made a typo when calling pg_basebackup
and was not warned about the invalid input.
But it doesn't matter that much, so if you don't think that's interesting, no
problem.



Re: [HACKERS] Interval for launching the table sync worker

2017-04-14 Thread Kyotaro HORIGUCHI
At Thu, 13 Apr 2017 20:02:33 +0200, Petr Jelinek  
wrote in 
> >> Although this is not a problem of this patch, this is a problem
> >> generally.
> 
> Huh? We explicitly switch to CacheMemoryContext before pallocing
> anything that should survive long term.

Ah.. yes, sorry for the noise.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test

2017-04-14 Thread Michael Paquier
On Thu, Apr 6, 2017 at 7:48 AM, Michael Paquier
 wrote:
> On Wed, Apr 5, 2017 at 10:24 PM, Stephen Frost  wrote:
>> * Michael Paquier (michael.paqu...@gmail.com) wrote:
>>> 1) Initialize the old cluster and start it.
>>> 2) create a bunch of databases with full range of ascii characters.
>>> 3) Run regression tests.
>>> 4) Take dump on old cluster.
>>> 4) Stop the old cluster.
>>> 5) Initialize the new cluster.
>>> 6) Run pg_upgrade.
>>> 7) Start new cluster.
>>> 8) Take dump from it.
>>> 9) Run deletion script (Oops forgot this part!)
>>
>> Presumably the check to match the old dump against the new one is also
>> performed?
>
> Yes. That's run with command_ok() at the end.

Attached is an updated patch to use --no-sync with pg_dumpall calls.
-- 
Michael


pgupgrade-tap-test-v3.patch
Description: Binary data
