date:20170324

Re: [HACKERS] create_unique_path and GEQO

2017-03-24 Thread Rushabh Lathia

On Wed, Mar 22, 2017 at 3:43 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> Hi,
> In create_unique_path() there's comment
> /*
>  * We must ensure path struct and subsidiary data are allocated in main
>  * planning context; otherwise GEQO memory management causes trouble.
>  */
> oldcontext = MemoryContextSwitchTo(root->planner_cxt);
>
> pathnode = makeNode(UniquePath);
>
> This means that when GEQO resets the memory context, the RelOptInfo
> for which this path is created and may be set to cheapest_unique_path
> goes away, the unique path lingers on in the planner context.
> Shouldn't we instead allocate the path in the same context as the
> RelOptInfo similar to mark_dummy_rel()?
>
>
Do you have test case, which can reproduce the issue you
explained above?


> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Rushabh Lathia

Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows

2017-03-24 Thread Craig Ringer

On 24 March 2017 at 14:07, Kuntal Ghosh  wrote:
> On Fri, Mar 24, 2017 at 11:28 AM, Kuntal Ghosh
>  wrote:
>> Hello,
>> In Windows, if one needs to take a dump in plain text format (this is
>> the default option, or can be specified using -Fp) with some level of
>> compression (-Z[0-9]), an output file has to
>> be specified. Otherwise, if the output is redirected to stdout, it'll
>> create a corrupted dump (cmd is set to ASCII mode, so it'll put
>> carriage returns in the file).
> To reproduce the issue, please use the following command in windows cmd:
>
> pg_dump -Z 9 test > E:\test_xu.backup
> pg_dump -Fp -Z 9 test > E:\test_xu.backup

This is a known problem. It is not specific to PostgreSQL, it affects
any software that attempts to use stdin/stdout on Windows via cmd,
where it is not 8-bit clean.

We don't just refuse to run with stdout as a destination because it's
perfectly sensible if you're not using cmd.exe. pg_dump cannot, as far
as I know, tell whether it's being invoked by cmd or something else.

If you have concrete ideas on how to improve this they'd be welcomed.
Is there anywhere you expected to find info in the docs? Do you know
of a way to detect in Windows if the output stream is not 8-bit clean
from within the application program? ... other?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [Proposal] Make the optimiser aware of partitions ordering

2017-03-24 Thread Ashutosh Bapat

On Mon, Mar 20, 2017 at 8:33 PM, Ronan Dunklau  wrote:
> On lundi 20 mars 2017 15:52:03 CET Robert Haas wrote:
>> On Mon, Mar 20, 2017 at 6:31 AM, Ronan Dunklau 
> wrote:
>> > With range partitioning, we guarantee that each partition contains non-
>> > overlapping values. Since we know the range allowed for each partition, it
>> > is possible to sort them according to the partition key (as is done
>> > already for looking up partitions).
>> >
>> > Thus, we ca generate sorted Append plans instead of MergeAppend when
>> > sorting occurs on the partition key.
>>
>> Great idea.  This is too late for v10 at this point, but please add it
>> to the next CommitFest so we don't forget about it.
>
> I know it is too late, and thought that it was too early to add it to the
> commitfest properly since so many design decisions should be discussed. Thanks
> for the feedback, I added it.

Thanks for working on this. I was also thinking about the same.

I will try to get back to review/work with this for v11. Mean time, I
am working on partition-wise joins [1]. In those patches, I have added
a structure called PartitionScheme, which represents how a relation is
partitioned. For base relations it's mostly copy of PartitionDesc and
PartitionKey, but then it bubbles up across join, with each
partitioned join getting relevant partitioning scheme. If you could
base your design such that is uses PartitionScheme, it could be used
for joins and probably when Jeevan's patch for partition-wise
aggregate [2] comes along, it can be used with grouping.

[1] 
https://www.postgresql.org/message-id/CAFjFpRcMWwepj-Do1otxQ-GApGPSZ1FmH7YQvQTwzQOGczq_sw%40mail.gmail.com
[2] http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg308861.html

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] segfault in hot standby for hash indexes

2017-03-24 Thread Amit Kapila

On Fri, Mar 24, 2017 at 12:25 PM, Ashutosh Sharma  wrote:
>>
>> I think it would have been probably okay to use *int* for ntuples as
>> that matches with what you are actually assigning in the function.
>
> okay, corrected it. Attached is newer version of patch.
>

Thanks, this version looks good to me.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-24 Thread Amit Kapila

On Fri, Mar 24, 2017 at 9:50 AM, Ashutosh Sharma  wrote:
> On Fri, Mar 24, 2017 at 9:46 AM, Ashutosh Sharma  
> wrote:
>>
>> Thanks for reviewing my patch. I have removed the extra white space.
>> Attached are both the patches.
>
> Sorry, I have mistakenly attached wrong patch. Here are the correct
> set of patches.
>

The latest version of patches looks fine to me.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel Append implementation

2017-03-24 Thread Amit Khandekar

On 24 March 2017 at 13:11, Rajkumar Raghuwanshi
 wrote:
> I have given patch on latest pg sources (on commit
> 457a4448732881b5008f7a3bcca76fc299075ac3). configure and make all
> install ran successfully, but initdb failed with below error.

> FailedAssertion("!(LWLockTranchesAllocated >=
> LWTRANCHE_FIRST_USER_DEFINED)", File: "lwlock.c", Line: 501)

Thanks for reporting, Rajkumar.

With the new PARALLEL_APPEND tranche ID, LWTRANCHE_FIRST_USER_DEFINED
value has crossed the value 64. So we need to increase the initial
size of LWLockTrancheArray from 64 to 128. Attached is the updated
patch.


ParallelAppend_v11.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-03-24 Thread Amit Langote

Hi Ashutosh,

On 2017/03/23 21:48, Ashutosh Bapat wrote:
>>> I have fixed all the issues reported till now.

In patch 0007, the following code in have_partkey_equi_join() looks
potentially unsafe:

/*
 * If the clause refers to different partition keys from
 * both relations, it can not be used for partition-wise join.
 */
if (ipk1 != ipk2)
continue;

/*
 * The clause allows partition-wise join if only it uses the same
 * operator family as that specified by the partition key.
 */
if (!list_member_oid(rinfo->mergeopfamilies,
 part_scheme->partopfamily[ipk1]))
continue;

What if ipk1 and ipk2 both turn out to be -1? Accessing
part_schem->partopfamily[ipk1] would be incorrect, no?

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Michael Banck

Hi,

On Fri, Mar 24, 2017 at 12:32:28AM +0100, Petr Jelinek wrote:
> Yes, I also forgot to check if the table actually exists on subscriber
> when fetching them in CREATE SUBSCRIPTION (we have check during
> replication but not there).
> 
> Attached patches should fix both issues.

I no longer get a segfault, thanks.

However, replication also seems to not work, I'm using the following
script right now:

--8<--
#!/bin/sh

set -x

#rm -rf data_* pg*.log

initdb --pgdata=data_pg1 1> /dev/null 2>&1
sed -i -e 's/^#wal_level.=.replica/wal_level = logical/'
data_pg1/postgresql.conf
pg_ctl --pgdata=data_pg1 -l pg1.log start 1> /dev/null
psql -c "CREATE TABLE t1(id int);"
pg_basebackup --pgdata=data_pg2
sed -i -e 's/^#port.=.5432/port = 5433/' data_pg2/postgresql.conf
psql -c "INSERT INTO t1 VALUES(1);"
pg_ctl --pgdata=data_pg2 -l pg2.log start 1> /dev/null
psql -c "CREATE PUBLICATION pub1;"
psql --port=5433 -c "CREATE SUBSCRIPTION sub1 CONNECTION
'dbname=postgres' PUBLICATION pub1 WITH (COPY DATA);"
sleep 6
psql -c 'SELECT application_name, state FROM pg_stat_replication;'
psql --port=5433 -c "SELECT COUNT(*) FROM t1;"
--8<--

(I had to add the sleep 6 - 5 wasn't always enough - in order to get the
subcription from 'catchup' to 'streaming' which was a bit surprising to
me)

This is the output:

--8<--
+ initdb --pgdata=data_pg1
+ sed -i -e s/^#wal_level.=.replica/wal_level = logical/
data_pg1/postgresql.conf
+ pg_ctl --pgdata=data_pg1 -l pg1.log start
+ psql -c CREATE TABLE t1(id int);
CREATE TABLE
+ pg_basebackup --pgdata=data_pg2
+ sed -i -e s/^#port.=.5432/port = 5433/ data_pg2/postgresql.conf
+ psql -c INSERT INTO t1 VALUES(1);
INSERT 0 1
+ pg_ctl --pgdata=data_pg2 -l pg2.log start
+ psql -c CREATE PUBLICATION pub1;
CREATE PUBLICATION
+ psql --port=5433 -c CREATE SUBSCRIPTION sub1 CONNECTION
'dbname=postgres' PUBLICATION pub1 WITH (COPY DATA);
NOTICE:  created replication slot "sub1" on publisher
NOTICE:  synchronized table states
CREATE SUBSCRIPTION
+ sleep 6
+ psql -c SELECT application_name, state FROM pg_stat_replication;
 application_name |   state   
--+---
 sub1 | streaming
(1 row)

+ psql --port=5433 -c SELECT COUNT(*) FROM t1;
 count 
---
 0
(1 row)
--8<--

I would have assumed the inserted value to be visible on the standby.
If I insert further values, don't show up, either.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [pgsql-www] [HACKERS] Small issue in online devel documentation build

2017-03-24 Thread Fabien COELHO



Hello Peter,


I think the fix belongs into the web site CSS, so there is nothing to
commit into PostgreSQL here.


Indeed, the changes were only for the "remove nesting" solution.

I will close the commit fest entry, but I have added a section to the 
open items list so we keep track of it. 
(https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items#Documentation_tool_chain)


I put forward that the quick workaround a colleague of mine suggested (aka 
something like code code { font-size: 100%; important! }) could also be 
applied to the web site CSS while waiting for a more definite answer which 
might take some pretty unknown time close to never?


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)

2017-03-24 Thread Craig Ringer

On 24 March 2017 at 02:29, Robert Haas  wrote:
> On Tue, Mar 21, 2017 at 11:35 PM, Craig Ringer  wrote:
>> Changes made per discussion.
>
> Committed 0001.

Much appreciated.

Here's the 2nd patch rebased on top of master, with the TAP test
included this time. Looks ready to go.

I really appreciate the time you've taken to look at this. Point me at
anything from your team you want some outside review on.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From eca78655c966613b1b98baf9049b5c4d519d375a Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Mon, 23 Jan 2017 13:34:02 +0800
Subject: [PATCH] Introduce txid_status(bigint) to get status of an xact

If an application loses its connection while a COMMIT request is in
flight, the backend crashes mid-commit, etc, then an application may
not be sure whether or not a commit completed successfully or was
rolled back. While two-phase commit solves this it does so at a
considerable overhead, so introduce a lighter alternative.

txid_status(bigint) lets an application determine the status of a a
commit based on an xid-with-epoch as returned by txid_current() or
similar. Status may be committed, aborted, in-progress (including
prepared xacts) or null if the xact is too old for its commit status
to still be retained because it has passed the wrap-around epoch
boundary.

Applications must call txid_current() in their transactions to make
much use of this since PostgreSQL does not automatically report an xid
to the client when one is assigned.

Introduces TransactionIdInRecentPast(...) for the use of other
functions that need similar logic in future.

Authors: Craig Ringer, Robert Haas
---
 doc/src/sgml/func.sgml|  27 ++
 src/backend/utils/adt/txid.c  | 132 ++
 src/include/catalog/pg_proc.h |   2 +
 src/test/recovery/t/011_crash_recovery.pl |  46 +++
 src/test/regress/expected/txid.out|  68 +++
 src/test/regress/sql/txid.sql |  38 +
 6 files changed, 313 insertions(+)
 create mode 100644 src/test/recovery/t/011_crash_recovery.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 64f86ce..356fd33 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17523,6 +17523,10 @@ SELECT collation for ('foo' COLLATE "de_DE");
 txid_visible_in_snapshot

 
+   
+txid_status
+   
+

 The functions shown in 
 provide server transaction information in an exportable form.  The main
@@ -17573,6 +17577,11 @@ SELECT collation for ('foo' COLLATE "de_DE");
boolean
is transaction ID visible in snapshot? (do not use with subtransaction ids)
   
+  
+   txid_status(bigint)
+   txid_status
+   report the status of the given xact - committed, aborted, in progress, or NULL if the xid is too old
+  
  
 

@@ -17643,6 +17652,24 @@ SELECT collation for ('foo' COLLATE "de_DE");

 

+txid_status(bigint) reports the commit status of a recent
+transaction.  Applications may use it to determine whether a transaction
+committed or aborted when the application and database server become
+disconnected while a COMMIT is in progress.
+The status of a transaction will be reported as either
+in progress,
+committed, or aborted, provided that the
+transaction is recent enough that the system retains the commit status
+of that transaction.  If is old enough that no references to that
+transaction survive in the system and the commit status information has
+been discarded, this function will return NULL.  Note that prepared
+transactions are reported as in progress; applications must
+check pg_prepared_xacts if they
+need to determine whether the xid is a prepared transaction.
+   
+
+   
 The functions shown in 
 provide information about transactions that have been already committed.
 These functions mainly provide information about when the transactions
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index 772d7c7..5c64e32 100644
--- a/src/backend/utils/adt/txid.c
+++ b/src/backend/utils/adt/txid.c
@@ -21,6 +21,7 @@
 
 #include "postgres.h"
 
+#include "access/clog.h"
 #include "access/transam.h"
 #include "access/xact.h"
 #include "access/xlog.h"
@@ -28,6 +29,7 @@
 #include "miscadmin.h"
 #include "libpq/pqformat.h"
 #include "postmaster/postmaster.h"
+#include "storage/lwlock.h"
 #include "utils/builtins.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
@@ -93,6 +95,70 @@ load_xid_epoch(TxidEpoch *state)
 }
 
 /*
+ * Helper to get a TransactionId from a 64-bit xid with wraparound detection.
+ *
+ * It is an ERROR if the xid is in the future.  Otherwise, returns true if
+ * the transaction is still new enough that we can determine

Re: [HACKERS] identity columns

2017-03-24 Thread Vitaly Burovoy

On 3/23/17, Peter Eisentraut  wrote:
> On 3/23/17 06:09, Vitaly Burovoy wrote:
>> I think we'll end up with "DROP IDENTITY IF EXISTS" to avoid raising
>> an exception and "ADD OR SET" if your grammar remains.
>
> That sounds reasonable to me.

It would be great if "DROP IDENTITY IF EXISTS" is in the current patch
since we don't have any disagreements about "DROP IDENTITY" behavior
and easiness of implementation of "IF EXISTS" there.

>> Right. From that PoV IDENTITY also changes a default value: "SET (ADD
>> ... AS?) IDENTITY" works as setting a default to "nextval(...)"
>> whereas "DROP IDENTITY" works as setting it back to NULL.
>
> But dropping and re-adding an identity destroys state, so it's not quite
> the same.

How does dropping and re-adding affects a choosing between "ADD" and "SET"?
If you drop identity property, you get a column's state destroyed
whatever grammar variation you are using.

> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

I have an idea. What about the next version of DDL:

DROP IDENTITY [ IF EXISTS ]
SET GENERATED { ALWAYS | BY DEFAULT } [ IF NOT EXISTS ] | SET ...

That version:
1. does not introduce a new "ADD" variation;
2. without "IF NOT EXISTS" follows the standard;
3. with "IF NOT EXISTS" sets a column's identity property or alters it
(works as "CREATE OR REPLACE" for functions).

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [POC] A better way to expand hash indexes.

2017-03-24 Thread Mithun Cy

On Fri, Mar 24, 2017 at 1:22 AM, Mithun Cy  wrote:
> Hi Amit please find the new patch

The pageinspect.sgml has an example which shows the output of
"hash_metapage_info()". Since we increase the spares array and
eventually ovflpoint, I have updated the example with corresponding
values..


-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


expand_hashbucket_efficiently_05.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ANALYZE command progress checker

2017-03-24 Thread vinayak

On 2017/03/23 15:04, Haribabu Kommi wrote:

On Wed, Mar 22, 2017 at 8:11 PM, vinayak 
> wrote:

On 2017/03/21 21:25, Haribabu Kommi wrote:

On Tue, Mar 21, 2017 at 3:41 PM, vinayak
> wrote:

Thank you for testing the patch on Windows platform.

Thanks for the updated patch.

It works good for a normal relation.  But for a relation that
contains child tables,
the PROGRESS_ANALYZE_NUM_ROWS_SAMPLED produces wrong results.

Thank you for reviewing the patch.
The attached patch implements a way to report sample rows count
from acquire_sample_rows() even if called for child tables.

How about adding another phase called
PROGRESS_ANALYZE_PHASE_COLLECT_INHERIT_SAMPLE_ROWS
and set this phase only when it is an inheritance analyze
operation. And adding
some explanation of ROWS_SAMPLED phase about inheritance tables
how these sampled rows are calculated will provide good analyze
progress of
relation that contains child relations also.

I have added the phase called
PROGRESS_ANALYZE_PHASE_COLLECT_INH_SAMPLE_ROWS.
I have also updated the documentation.

Thanks for the updated patch. I will check it.

The ANALYZE command takes long time in computing statistics
phase.So I think we can add some column or phase so that user can
easily understand the progress.
How about adding new column like "num_rows_processed" will compute
the statistics of specified column?

I prefer a column with rows processed instead of a phase.
Because we already set the phase of compute stats and showing
the progress there will number of rows processed will be good.

How about separate the computing "inheritance statistics" phase
from the computing regular "single table" statistics.
Comment?

Yes, this will be good to show both that states of inheritance of 
sampled rows and
compute inheritance stats, so that it will be clearly visible to the 
user the current

status.

I have updated the patch.
I have added column "num_rows_processed" and phase "computing inherited 
statistics" in the view.

=# \d+ pg_stat_progress_analyze
 View "pg_catalog.pg_stat_progress_analyze"
 Column |  Type   | Collation | Nullable | Default | 
Storage  | Description

+-+---+--+-+--+-
 pid| integer |   |  | | 
plain|
 datid  | oid |   |  | | 
plain|
 datname| name|   |  | | 
plain|
 relid  | oid |   |  | | 
plain|
 phase  | text|   |  | | 
extended |
 num_target_sample_rows | bigint  |   |  | | 
plain|
 num_rows_sampled   | bigint  |   |  | | 
plain|
 num_rows_processed | bigint  |   |  | | 
plain|

View definition:
 SELECT s.pid,
s.datid,
d.datname,
s.relid,
CASE s.param1
WHEN 0 THEN 'initializing'::text
WHEN 1 THEN 'collecting sample rows'::text
WHEN 2 THEN 'collecting inherited sample rows'::text
WHEN 3 THEN 'computing statistics'::text
WHEN 4 THEN 'computing inherited statistics'::text
ELSE NULL::text
END AS phase,
s.param2 AS num_target_sample_rows,
s.param3 AS num_rows_sampled,
s.param4 AS num_rows_processed
   FROM pg_stat_get_progress_info('ANALYZE'::text) s(pid, datid, relid, 
param1, param2, param3, param4, param5, param6, param7, param8, param9, 
param10)

 LEFT JOIN pg_database d ON s.datid = d.oid;

Regards,
Vinayak Pokale
NTT Open Source Software Center

pg_stat_progress_analyze_v6.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical decoding on standby

2017-03-24 Thread Craig Ringer

On 23 March 2017 at 17:44, Craig Ringer  wrote:

Minor update to catalog_xmin walsender patch to fix failure to
parenthesize definition of PROCARRAY_PROC_FLAGS_MASK .

This one's ready to go. Working on drop slots on DB drop now.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From b5e34ecaa8f43825fe41ae2e2bbf0a97258cb56a Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 22 Mar 2017 12:29:13 +0800
Subject: [PATCH] Report catalog_xmin separately to xmin in hot standby
 feedback

The catalog_xmin of slots on a standby was reported as part of the standby's
xmin, causing the master's xmin to be held down. This could cause considerable
unnecessary bloat on the master.

Instead, report catalog_xmin as a separate field in hot_standby_feedback. If
the upstream walsender is using a physical replication slot, store the
catalog_xmin in the slot's catalog_xmin field. If the upstream doesn't use a
slot and has only a PGPROC entry behaviour doesn't change, as we store the
combined xmin and catalog_xmin in the PGPROC entry.

There's no backward compatibility concern here, as nothing except another
postgres instance of the same major version has any business sending hot
standby feedback and it's only used on the physical replication protocol.
---
 doc/src/sgml/protocol.sgml |  33 ++-
 src/backend/replication/walreceiver.c  |  45 +++--
 src/backend/replication/walsender.c| 110 +++--
 src/backend/storage/ipc/procarray.c|  12 ++-
 src/include/storage/proc.h |   5 +
 src/include/storage/procarray.h|  11 +++
 .../recovery/t/010_logical_decoding_timelines.pl   |  38 ++-
 7 files changed, 202 insertions(+), 52 deletions(-)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 48ca414..b3a5026 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1916,10 +1916,11 @@ The commands accepted in walsender mode are:
   
   
   
-  The standby's current xmin. This may be 0, if the standby is
-  sending notification that Hot Standby feedback will no longer
-  be sent on this connection. Later non-zero messages may
-  reinitiate the feedback mechanism.
+  The standby's current global xmin, excluding the catalog_xmin from any
+  replication slots. If both this value and the following
+  catalog_xmin are 0 this is treated as a notification that Hot Standby
+  feedback will no longer be sent on this connection. Later non-zero
+  messages may reinitiate the feedback mechanism.
   
   
   
@@ -1929,7 +1930,29 @@ The commands accepted in walsender mode are:
   
   
   
-  The standby's current epoch.
+  The epoch of the global xmin xid on the standby.
+  
+  
+  
+  
+  
+  Int32
+  
+  
+  
+  The lowest catalog_xmin of any replication slots on the standby. Set to 0
+  if no catalog_xmin exists on the standby or if hot standby feedback is being
+  disabled.
+  
+  
+  
+  
+  
+  Int32
+  
+  
+  
+  The epoch of the catalog_xmin xid on the standby.
   
   
   
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 31c567b..0f22f17 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1175,8 +1175,8 @@ XLogWalRcvSendHSFeedback(bool immed)
 {
 	TimestampTz now;
 	TransactionId nextXid;
-	uint32		nextEpoch;
-	TransactionId xmin;
+	uint32		xmin_epoch, catalog_xmin_epoch;
+	TransactionId xmin, catalog_xmin;
 	static TimestampTz sendTime = 0;
 	/* initially true so we always send at least one feedback message */
 	static bool master_has_standby_xmin = true;
@@ -1221,29 +1221,56 @@ XLogWalRcvSendHSFeedback(bool immed)
 	 * everything else has been checked.
 	 */
 	if (hot_standby_feedback)
-		xmin = GetOldestXmin(NULL, PROCARRAY_FLAGS_DEFAULT);
+	{
+		TransactionId slot_xmin;
+
+		/*
+		 * Usually GetOldestXmin() would include the global replication slot
+		 * xmin and catalog_xmin in its calculations, but we don't want to hold
+		 * upstream back from vacuuming normal user table tuples just because
+		 * they're within the catalog_xmin horizon of logical replication slots
+		 * on this standby, so we ignore slot xmin and catalog_xmin GetOldestXmin
+		 * then deal with them ourselves.
+		 */
+		xmin = GetOldestXmin(NULL,
+			 PROCARRAY_FLAGS_DEFAULT|PROCARRAY_SLOTS_XMIN);
+
+		ProcArrayGetReplicationSlotXmin(_xmin, _xmin);
+
+		if (TransactionIdIsValid(slot_xmin) &&
+			TransactionIdPrecedes(slot_xmin, xmin))
+			xmin = slot_xmin;
+	}
 	else
+	{
 		xmin = InvalidTransactionId;
+		catalog_xmin =

Re: [HACKERS] segfault in hot standby for hash indexes

2017-03-24 Thread Ashutosh Sharma

>>> > I think this will work, but not sure if there is a merit to deviate
>>> > from what btree does to handle this case.   One thing I find slightly
>>> > awkward in hash_xlog_vacuum_get_latestRemovedXid() is that you are
>>> > using a number of tuples registered as part of fixed data
>>> > (xl_hash_vacuum_one_page) to traverse the data registered as buf data.
>>> > I think it will be better if we register offsets also in fixed part of
>>> > data as we are doing btree case.
>>
>> Agreed. I have made the changes accordingly. Please check attached v2 patch.
>>
>
> Changes look good to me.   I think you can modify the comments in
> structure xl_hash_vacuum_one_page to mention "TARGET OFFSET NUMBERS
> FOLLOW AT THE END"
>

Added the comment in xl_hash_vacuum_one_page structure.

>>>
>>> >
>>> >
>>>
>>> Also another small point in this regard, do we need two separate
>>> variables to track number of deleted items in below code?  I think one
>>> variable is sufficient.
>>>
>>> _hash_vacuum_one_page()
>>> {
>>> ..
>>> deletable[ndeletable++] = offnum;
>>> tuples_removed += 1;--
>>>
>>
>> Yes, I think 'ndeletable' alone should be fine.
>>
>
> I think it would have been probably okay to use *int* for ntuples as
> that matches with what you are actually assigning in the function.

okay, corrected it. Attached is newer version of patch.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com
diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index da4c2c5..2ccaf46 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -966,8 +966,6 @@ hash_xlog_vacuum_get_latestRemovedXid(XLogReaderState *record)
 	OffsetNumber	hoffnum;
 	TransactionId	latestRemovedXid = InvalidTransactionId;
 	int		i;
-	char *ptr;
-	Size len;
 
 	xlrec = (xl_hash_vacuum_one_page *) XLogRecGetData(record);
 
@@ -986,12 +984,20 @@ hash_xlog_vacuum_get_latestRemovedXid(XLogReaderState *record)
 		return latestRemovedXid;
 
 	/*
+	 * Check if WAL replay has reached a consistent database state. If not,
+	 * we must PANIC. See the definition of btree_xlog_delete_get_latestRemovedXid
+	 * for more details.
+	 */
+	if (!reachedConsistency)
+		elog(PANIC, "hash_xlog_vacuum_get_latestRemovedXid: cannot operate with inconsistent data");
+
+	/*
 	 * Get index page.  If the DB is consistent, this should not fail, nor
 	 * should any of the heap page fetches below.  If one does, we return
 	 * InvalidTransactionId to cancel all HS transactions.  That's probably
 	 * overkill, but it's safe, and certainly better than panicking here.
 	 */
-	XLogRecGetBlockTag(record, 1, , NULL, );
+	XLogRecGetBlockTag(record, 0, , NULL, );
 	ibuffer = XLogReadBufferExtended(rnode, MAIN_FORKNUM, blkno, RBM_NORMAL);
 
 	if (!BufferIsValid(ibuffer))
@@ -1003,9 +1009,7 @@ hash_xlog_vacuum_get_latestRemovedXid(XLogReaderState *record)
 	 * Loop through the deleted index items to obtain the TransactionId from
 	 * the heap items they point to.
 	 */
-	ptr = XLogRecGetBlockData(record, 1, );
-
-	unused = (OffsetNumber *) ptr;
+	unused = (OffsetNumber *) ((char *) xlrec + SizeOfHashVacuumOnePage);
 
 	for (i = 0; i < xlrec->ntuples; i++)
 	{
@@ -1130,23 +1134,15 @@ hash_xlog_vacuum_one_page(XLogReaderState *record)
 
 	if (action == BLK_NEEDS_REDO)
 	{
-		char *ptr;
-		Size len;
-
-		ptr = XLogRecGetBlockData(record, 0, );
-
 		page = (Page) BufferGetPage(buffer);
 
-		if (len > 0)
+		if (XLogRecGetDataLen(record) > SizeOfHashVacuumOnePage)
 		{
 			OffsetNumber *unused;
-			OffsetNumber *unend;
 
-			unused = (OffsetNumber *) ptr;
-			unend = (OffsetNumber *) ((char *) ptr + len);
+			unused = (OffsetNumber *) ((char *) xldata + SizeOfHashVacuumOnePage);
 
-			if ((unend - unused) > 0)
-PageIndexMultiDelete(page, unused, unend - unused);
+			PageIndexMultiDelete(page, unused, xldata->ntuples);
 		}
 
 		/*
diff --git a/src/backend/access/hash/hashinsert.c b/src/backend/access/hash/hashinsert.c
index 8640e85..8699b5b 100644
--- a/src/backend/access/hash/hashinsert.c
+++ b/src/backend/access/hash/hashinsert.c
@@ -344,7 +344,6 @@ _hash_vacuum_one_page(Relation rel, Buffer metabuf, Buffer buf,
 	Page	page = BufferGetPage(buf);
 	HashPageOpaque	pageopaque;
 	HashMetaPage	metap;
-	double tuples_removed = 0;
 
 	/* Scan each tuple in page to see if it is marked as LP_DEAD */
 	maxoff = PageGetMaxOffsetNumber(page);
@@ -355,10 +354,7 @@ _hash_vacuum_one_page(Relation rel, Buffer metabuf, Buffer buf,
 		ItemId	itemId = PageGetItemId(page, offnum);
 
 		if (ItemIdIsDead(itemId))
-		{
 			deletable[ndeletable++] = offnum;
-			tuples_removed += 1;
-		}
 	}
 
 	if (ndeletable > 0)
@@ -386,7 +382,7 @@ _hash_vacuum_one_page(Relation rel, Buffer metabuf, Buffer buf,
 		pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
 
 		metap = HashPageGetMeta(BufferGetPage(metabuf));
-		metap->hashm_ntuples -= tuples_removed;
+		metap->hashm_ntuples -= ndeletable;
 
 		MarkBufferDirty(buf);

Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows

2017-03-24 Thread Kuntal Ghosh

On Fri, Mar 24, 2017 at 12:35 PM, Craig Ringer  wrote:
> On 24 March 2017 at 14:07, Kuntal Ghosh  wrote:
>> On Fri, Mar 24, 2017 at 11:28 AM, Kuntal Ghosh
>>  wrote:
>>> Hello,
>>> In Windows, if one needs to take a dump in plain text format (this is
>>> the default option, or can be specified using -Fp) with some level of
>>> compression (-Z[0-9]), an output file has to
>>> be specified. Otherwise, if the output is redirected to stdout, it'll
>>> create a corrupted dump (cmd is set to ASCII mode, so it'll put
>>> carriage returns in the file).
>> To reproduce the issue, please use the following command in windows cmd:
>>
>> pg_dump -Z 9 test > E:\test_xu.backup
>> pg_dump -Fp -Z 9 test > E:\test_xu.backup
>
> This is a known problem. It is not specific to PostgreSQL, it affects
> any software that attempts to use stdin/stdout on Windows via cmd,
> where it is not 8-bit clean.
>
> We don't just refuse to run with stdout as a destination because it's
> perfectly sensible if you're not using cmd.exe. pg_dump cannot, as far
> as I know, tell whether it's being invoked by cmd or something else.
ASAICU, if we use binary mode, output is stored bit by bit. In ASCII
mode, cmd pokes its nose and does CR / LF conversions on its own. So,
whenever we want compression on a plain-text dump file, we can set the
stdout mode to O_BINARY. Is it a wrong approach?

> If you have concrete ideas on how to improve this they'd be welcomed.
> Is there anywhere you expected to find info in the docs? Do you know
> of a way to detect in Windows if the output stream is not 8-bit clean
> from within the application program? ... other?
Actually, I'm not that familiar with windows environment. But, I
couldn't find any note to user in pg_dump documentation regarding the
issue. In cmd, if someone needs a plain-text dump in compressed
format, they should specify the output file, otherwise they may run
into the above problem. However, if a dump is corrupted due to the
above issue, a fix for that is provided in [1]. Should we include this
in the documentation?



[1] http://www.gzip.org/
Use fixgz.c to remove the extra CR (carriage return) bytes.

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows

2017-03-24 Thread Kuntal Ghosh

On Fri, Mar 24, 2017 at 11:28 AM, Kuntal Ghosh
 wrote:
> Hello,
> In Windows, if one needs to take a dump in plain text format (this is
> the default option, or can be specified using -Fp) with some level of
> compression (-Z[0-9]), an output file has to
> be specified. Otherwise, if the output is redirected to stdout, it'll
> create a corrupted dump (cmd is set to ASCII mode, so it'll put
> carriage returns in the file).
To reproduce the issue, please use the following command in windows cmd:

pg_dump -Z 9 test > E:\test_xu.backup
pg_dump -Fp -Z 9 test > E:\test_xu.backup


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] create_unique_path and GEQO

2017-03-24 Thread Ashutosh Bapat

On Fri, Mar 24, 2017 at 11:52 AM, Rushabh Lathia
 wrote:
>
>
> On Wed, Mar 22, 2017 at 3:43 PM, Ashutosh Bapat
>  wrote:
>>
>> Hi,
>> In create_unique_path() there's comment
>> /*
>>  * We must ensure path struct and subsidiary data are allocated in
>> main
>>  * planning context; otherwise GEQO memory management causes trouble.
>>  */
>> oldcontext = MemoryContextSwitchTo(root->planner_cxt);
>>
>> pathnode = makeNode(UniquePath);
>>
>> This means that when GEQO resets the memory context, the RelOptInfo
>> for which this path is created and may be set to cheapest_unique_path
>> goes away, the unique path lingers on in the planner context.
>> Shouldn't we instead allocate the path in the same context as the
>> RelOptInfo similar to mark_dummy_rel()?
>>
>
> Do you have test case, which can reproduce the issue you
> explained above?
>

No. It would require some surgery in standard_planner() to measure the
memory consumed in the planner context OR build the code with
SHOW_MEMORY_STATS defined and dump memory context statistics and check
planner context memory usage. I don't think I can produce a testcase
quickly right now. But then, I think the problem is quite apparent
from the code inspection alone, esp. comparing what mark_dummy_rel()
does with what create_unique_path() is doing.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] error handling in RegisterBackgroundWorker

2017-03-24 Thread Michael Paquier

On Fri, Mar 24, 2017 at 1:20 PM, Peter Eisentraut
 wrote:
> On 2/15/17 12:11, Robert Haas wrote:
>> On Wed, Feb 15, 2017 at 11:30 AM, Peter Eisentraut
>>  wrote:
>>> If RegisterBackgroundWorker() (the non-dynamic kind that is only
>>> loadable from shared_preload_libraries) fails to register the worker, it
>>> writes a log message and proceeds, ignoring the registration request.  I
>>> think that is a mistake, it should be a hard error.  The only way in
>>> practice to fix the problem is to change shared_preload_libraries or
>>> max_worker_processes, both requiring a restart anyway, so proceeding
>>> without the worker is not useful.
>>
>> I guess the question is whether people will prefer to have the
>> database start up and be missing the worker, or to have it not start.
>> As you point out, the former is likely to result in an eventual
>> restart, but the latter may lead to a longer period of downtime RIGHT
>> NOW.  People tend to really hate things that make the database not
>> start, so I'm not sure what's best here.
>
> Any other thoughts on this?  Seems like a potential usability issue.

What if we just let the user choose what they want with a new switch
in bgw_flags, but keep LOG the default? One behavior and the other
look both sensible to me.

@@ -824,7 +824,8 @@ RegisterBackgroundWorker(BackgroundWorker *worker)
   "Up to %d background workers can be
registered with the current settings.",
   max_worker_processes,
   max_worker_processes),
- errhint("Consider increasing the configuration
parameter \"max_worker_processes\".")));
+ errhint("Consider increasing the configuration
parameter \"max_worker_processes\"."),
+ errcontext("registration of background worker
\"%s\"", worker->bgw_name)));
 return;
 }
No issues with this bit in 0001.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel Append implementation

2017-03-24 Thread Rajkumar Raghuwanshi

On Fri, Mar 24, 2017 at 12:38 AM, Amit Khandekar  wrote:
> Meanwhile, attached is a WIP patch v10. The only change in this patch
> w.r.t. the last patch (v9) is that this one has a new function defined
> append_nonpartial_cost(). Just sending this to show how the algorithm
> looks like; haven't yet called it.
>

Hi,

I have given patch on latest pg sources (on commit
457a4448732881b5008f7a3bcca76fc299075ac3). configure and make all
install ran successfully, but initdb failed with below error.

[edb@localhost bin]$ ./initdb -D data
The files belonging to this database system will be owned by user "edb".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory data ... ok
creating subdirectories ... ok
selecting default max_connections ... sh: line 1:  3106 Aborted
 (core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=100 -c shared_buffers=1000 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
sh: line 1:  3112 Aborted (core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=50 -c shared_buffers=500 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
sh: line 1:  3115 Aborted (core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=40 -c shared_buffers=400 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
sh: line 1:  3118 Aborted (core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=30 -c shared_buffers=300 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
sh: line 1:  3121 Aborted (core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=20 -c shared_buffers=200 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
sh: line 1:  3124 Aborted (core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=10 -c shared_buffers=100 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
10
selecting default shared_buffers ... sh: line 1:  3127 Aborted
(core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=10 -c shared_buffers=16384 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
400kB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... TRAP:
FailedAssertion("!(LWLockTranchesAllocated >=
LWTRANCHE_FIRST_USER_DEFINED)", File: "lwlock.c", Line: 501)
child process was terminated by signal 6: Aborted
initdb: removing data directory "data"

[edb@localhost bin]$

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Mark Kirkwood


On 24/03/17 12:32, Petr Jelinek wrote:


On 24/03/17 00:14, Mark Kirkwood wrote:

On 24/03/17 02:00, Peter Eisentraut wrote:

On 3/21/17 21:38, Peter Eisentraut wrote:

This patch is looking pretty good to me, modulo the failing pg_dump
tests.

Attached is a fixup patch.  I have mainly updated some comments and
variable naming for (my) clarity.  No functional changes.

Committed all that.


Testing now this patch is in, I'm unable to create a subscription:

(master)

bench=# CREATE PUBLICATION pgbench
 FOR TABLE pgbench_accounts , pgbench_branches,
   pgbench_tellers, pgbench_history
 WITH (PUBLISH INSERT, PUBLISH UPDATE, PUBLISH DELETE);

(slave)

bench=# CREATE SUBSCRIPTION pgbench
 CONNECTION 'port=5447 user=postgres dbname=bench'
 PUBLICATION pgbench
 WITH (COPY DATA);
ERROR:  duplicate key value violates unique constraint
"pg_subscription_rel_srrelid_srsubid_index"
DETAIL:  Key (srrelid, srsubid)=(0, 16389) already exists.

This is a pair of freshly initdb'ed instances, the master has a size 100
pgbench schema.

I'm guessing this is a different bug from the segfault also reported


Yes, I also forgot to check if the table actually exists on subscriber
when fetching them in CREATE SUBSCRIPTION (we have check during
replication but not there).

Attached patches should fix both issues.




Yep, does seem to.

I note that (probably intensional) specifying 'COPY DATA' does not 
'CREATE TABLES' for you...ok, I probably didn't read ...something 
somewhere...but anyway, after creating the tables it all seems to work. 
Nice.


However one minor observation - as Michael Banck noted - the elapsed 
time for slave to catch up after running:


$ pgbench -c8 -T600 bench

on the master was (subjectively) much longer than for physical streaming 
replication. Is this expected?


regards

Mark




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Adding support for Default partition in partitioning

2017-03-24 Thread Rahila Syed

Hello Rushabh,

Thank you for reviewing.
Have addressed all your comments in the attached patch. The attached patch
currently throws an
error if a new partition is added after default partition.

>Rather then adding check for default here, I think this should be handle
inside
>get_qual_for_list().
Have moved the check inside get_qual_for_partbound() as needed to do some
operations
before calling get_qual_for_list() for default partitions.

Thank you,
Rahila Syed

On Tue, Mar 21, 2017 at 11:36 AM, Rushabh Lathia 
wrote:

> I picked this for review and noticed that patch is not getting
> cleanly complied on my environment.
>
> partition.c: In function ‘RelationBuildPartitionDesc’:
> partition.c:269:6: error: ISO C90 forbids mixed declarations and code
> [-Werror=declaration-after-statement]
>   Const*val = lfirst(c);
>   ^
> partition.c: In function ‘generate_partition_qual’:
> partition.c:1590:2: error: ISO C90 forbids mixed declarations and code
> [-Werror=declaration-after-statement]
>   PartitionBoundSpec *spec = (PartitionBoundSpec *)bound;
>   ^
> partition.c: In function ‘get_partition_for_tuple’:
> partition.c:1820:5: error: array subscript has type ‘char’
> [-Werror=char-subscripts]
>  result = parent->indexes[partdesc->boundinfo->def_index];
>  ^
> partition.c:1824:16: error: assignment makes pointer from integer without
> a cast [-Werror]
>  *failed_at = RelationGetRelid(parent->reldesc);
> ^
> cc1: all warnings being treated as errors
>
> Apart from this, I was reading patch here are few more comments:
>
> 1) Variable initializing happening at two place.
>
> @@ -166,6 +170,8 @@ RelationBuildPartitionDesc(Relation rel)
>  /* List partitioning specific */
>  PartitionListValue **all_values = NULL;
>  boolfound_null = false;
> +boolfound_def = false;
> +intdef_index = -1;
>  intnull_index = -1;
>
>  /* Range partitioning specific */
> @@ -239,6 +245,8 @@ RelationBuildPartitionDesc(Relation rel)
>  i = 0;
>  found_null = false;
>  null_index = -1;
> +found_def = false;
> +def_index = -1;
>  foreach(cell, boundspecs)
>  {
>  ListCell   *c;
> @@ -249,6 +257,15 @@ RelationBuildPartitionDesc(Relation rel)
>
>
> 2)
>
> @@ -1558,6 +1586,19 @@ generate_partition_qual(Relation rel)
>  bound = stringToNode(TextDatumGetCString(boundDatum));
>  ReleaseSysCache(tuple);
>
> +/* Return if it is a default list partition */
> +PartitionBoundSpec *spec = (PartitionBoundSpec *)bound;
> +ListCell *cell;
> +foreach(cell, spec->listdatums)
>
> More comment on above hunk is needed?
>
> Rather then adding check for default here, I think this should be handle
> inside
> get_qual_for_list().
>
> 3) Code is not aligned with existing
>
> diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
> index 6316688..ebb7db7 100644
> --- a/src/backend/parser/gram.y
> +++ b/src/backend/parser/gram.y
> @@ -2594,6 +2594,7 @@ partbound_datum:
>  Sconst{ $$ = makeStringConst($1, @1); }
>  | NumericOnly{ $$ = makeAConst($1, @1); }
>  | NULL_P{ $$ = makeNullAConst(@1); }
> +| DEFAULT  { $$ = (Node *)makeDefElem("DEFAULT", NULL, @1); }
>  ;
>
>
> 4) Unnecessary hunk:
>
> @@ -2601,7 +2602,6 @@ partbound_datum_list:
>  | partbound_datum_list ',' partbound_datum
>  { $$ = lappend($1, $3); }
>  ;
> -
>
> Note: this is just an initially review comments, I am yet to do the
> detailed review
> and the testing for the patch.
>
> Thanks.
>
> On Mon, Mar 20, 2017 at 9:27 AM, Rahila Syed 
> wrote:
>
>> Hello,
>>
>> Please find attached a rebased patch with support for pg_dump. I am
>> working on the patch
>> to handle adding a new partition after a default partition by throwing an
>> error if
>> conflicting rows exist in default partition and adding the partition
>> successfully otherwise.
>> Will post an updated patch by tomorrow.
>>
>> Thank you,
>> Rahila Syed
>>
>> On Mon, Mar 13, 2017 at 5:42 AM, Robert Haas 
>> wrote:
>>
>>> On Fri, Mar 10, 2017 at 2:17 PM, Peter Eisentraut
>>>  wrote:
>>> > On 3/2/17 21:40, Robert Haas wrote:
>>> >> On the point mentioned above, I
>>> >> don't think adding a partition should move tuples, necessarily; seems
>>> >> like it would be good enough - maybe better - for it to fail if there
>>> >> are any that would need to be moved.
>>> >
>>> > ISTM that the uses cases of various combinations of adding a default
>>> > partition, adding another partition after it, removing the default
>>> > partition, clearing out the default partition in order to add more
>>> > nondefault partitions, and so on, need to be more clearly spelled out

Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-24 Thread Amit Kapila

On Fri, Mar 24, 2017 at 12:25 AM, Pavan Deolasee
 wrote:
>
>
> On Thu, Mar 23, 2017 at 7:53 PM, Amit Kapila 
> wrote:
>>
>> >
>>
>> I am not sure on what basis user can set such parameters, it will be
>> quite difficult to tune those parameters.  I think the point is
>> whatever threshold we keep, once that is crossed, it will perform two
>> scans for all the indexes.
>
>
> Well, that applies to even vacuum parameters, no?
>

I don't know how much we can directly compare the usability of the new
parameters you are proposing here to existing parameters.

> The general sense I've got
> here is that we're ok to push some work in background if it helps the
> real-time queries, and I kinda agree with that.
>

I don't think we can define this work as "some" work, it can be a lot
of work depending on the number of indexes.  Also, I think for some
cases it will generate maintenance work without generating benefit.
For example, when there is one index on a table and there are updates
for that index column.

> Having said that, I see many ways we can improve on this later. Like we can
> track somewhere else information about tuples which may have received WARM
> updates (I think it will need to be a per-index bitmap or so) and use that
> to do WARM chain conversion in a single index pass.
>

Sure, if we have some way to do it in a single pass or does most of
the time in foreground process (like we have some dead marking idea
for indexes), then it would have been better.

> But this is clearly not
> PG 10 material.
>

I don't see much discussion about this aspect of the patch, so not
sure if it is acceptable to increase the cost of vacuum.  Now, I don't
know if your idea of GUC's make it such that the additional cost will
occur seldom and this additional pass has a minimal impact which will
make it acceptable.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix and simplify check for whether we're running as Windows serv

2017-03-24 Thread Heikki Linnakangas

On 03/22/2017 07:44 PM, Robert Haas wrote:

On Wed, Mar 22, 2017 at 10:13 AM, Alvaro Herrera
wrote:

Heikki Linnakangas wrote:

I did some archeology, and found CheckTokenMembership() in MinGW's w32api
packages version 3.14
(https://sourceforge.net/projects/mingw/files/MinGW/Base/w32api/w32api-3.14/,
in include/winbase.h). According to the timestamps on that download page,
that was released in 2009. That was the oldest version I could find, so it
might go even further back.

Dave, do you know exactly what version of MinGW narwhal is running? And how
difficult is it to upgrade to something slightly more modern? Ease of
upgrade is another good data point on how far we need to support old
versions.

Given that this was backpatched and that it broke narwhal in all
branches, I think the solution needs to make narwhal work again without
requiring it to upgrade; so we should acquire CheckTokenMembership via
dynloading just like we do the other functions. If we want to require a
newer mingw version in pg10, that's acceptable, but it should be a
separate patch.

+1 for not moving the minimum system requirements in the back-branches.

Ok. I reverted this patch in the back-branches, and applied the much
less invasive "V2" patch [1] instead. HEAD is unchanged, so narwhal
still fails there.

Dave: the consensus is that we no longer support the old version of
MinGW that narwhal is using, for PostgreSQL v 10. Can you modify the
configuration of narwhal to not try building 'master' anymore, or
upgrade the toolchain, please?

[1]
https://www.postgresql.org/message-id/CAB7nPqSvfu%3DKpJ%3DNX%2BYAHmgAmQdzA7N5h31BjzXeMgczhGCC%2BQ%40mail.gmail.com

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows

2017-03-24 Thread Kuntal Ghosh

On Fri, Mar 24, 2017 at 2:17 PM, Kuntal Ghosh
 wrote:
> On Fri, Mar 24, 2017 at 12:35 PM, Craig Ringer  wrote:
>> On 24 March 2017 at 14:07, Kuntal Ghosh  wrote:
>>> On Fri, Mar 24, 2017 at 11:28 AM, Kuntal Ghosh
>>>  wrote:
 Hello,
 In Windows, if one needs to take a dump in plain text format (this is
 the default option, or can be specified using -Fp) with some level of
 compression (-Z[0-9]), an output file has to
 be specified. Otherwise, if the output is redirected to stdout, it'll
 create a corrupted dump (cmd is set to ASCII mode, so it'll put
 carriage returns in the file).
>>> To reproduce the issue, please use the following command in windows cmd:
>>>
>>> pg_dump -Z 9 test > E:\test_xu.backup
>>> pg_dump -Fp -Z 9 test > E:\test_xu.backup
>>
>> This is a known problem. It is not specific to PostgreSQL, it affects
>> any software that attempts to use stdin/stdout on Windows via cmd,
>> where it is not 8-bit clean.
>>
>> We don't just refuse to run with stdout as a destination because it's
>> perfectly sensible if you're not using cmd.exe. pg_dump cannot, as far
>> as I know, tell whether it's being invoked by cmd or something else.
> ASAICU, if we use binary mode, output is stored bit by bit. In ASCII
> mode, cmd pokes its nose and does CR / LF conversions on its own. So,
> whenever we want compression on a plain-text dump file, we can set the
> stdout mode to O_BINARY. Is it a wrong approach?
With the help from Ashutosh Sharma, I tested this in Windows
environment. Sadly, it still doesn't work. :( IMHO, we should document
the issue somewhere.



-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] increasing the default WAL segment size

2017-03-24 Thread Stephen Frost

Peter,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> There is a function for that.
[...]
> There is not a function for that, but there could be one.

I'm not sure you've really considered what you're suggesting here.

We need to to make sure we have every file between two LSNs.  Yes, we
could step the LSN forward one byte at a time, calling the appropriate
function for every single byte, to make sure that we have that file, but
that really isn't a reasonable approach.  Nor would it be reasonable if
we go on the assumption that WAL files can't be less than 1MB.

Beyond that, this also bakes in an assumption that we would then require
access to a database (of a current enough version to have the functions
needed too!) to connect to and run these functions, which is a poor
design.  If the user is using a remote system to gather the WAL on, that
system may not have easy access to PG.  Further, backup tools will want
to do things like off-line verification that the backup is complete,
perhaps in another environment entirely which doesn't have PG, or maybe
where what they're trying to do is make sure that a given backup is good
before starting a restore to bring PG back up.

Also, given that one of the things we're talking about here is
specifically that we want to be able to change the WAL size for
different databases, you would have to make sure that the database
you're running these functions on uses the same WAL file size as the one
which is being backed up.

No, I don't agree that we can claim the LSN -> WAL filename mapping is
an internal PG detail that we can whack around because there are
functions to calculate the answer.  External utilities need to be able
to perform that translation and we need to document for them how to do
so correctly.

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] Re: Declarative partitioning optimization for large amount of partitions

2017-03-24 Thread Aleksander Alekseev

Hi Anastasia,

Thanks a lot for a review!

As was mentioned above there is also a bottleneck in find_all_inheritors
procedure. Turned out the problem there is similar and it could be easily
fixed as well. Corresponding patch is attached to this message. To keep
things in order I'm attaching the previous patch as well.

On Wed, Mar 22, 2017 at 11:53:45AM +, Anastasia Lubennikova wrote:
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   tested, passed
> Documentation:not tested
> 
> The patch looks good to me.
> It applies clearly, passes all the tests and eliminates the bottleneck 
> described in the first message.
> And as I can see from the thread, there were no objections from others,
> except a few minor comments about code style, which are fixed in the last 
> version of the patch.
> So I mark it "Ready for committer".
> 
> The new status of this patch is: Ready for Committer
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

-- 
Best regards,
Aleksander Alekseev
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 7cacb1e9b2..a22a5a37c8 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -161,6 +161,20 @@ typedef struct TabStatusArray
 static TabStatusArray *pgStatTabList = NULL;
 
 /*
+ * pgStatTabHash entry
+ */
+typedef struct TabStatHashEntry
+{
+	Oid t_id;
+	PgStat_TableStatus* tsa_entry;
+} TabStatHashEntry;
+
+/*
+ * Hash table for O(1) t_id -> tsa_entry lookup
+ */
+static HTAB *pgStatTabHash = NULL;
+
+/*
  * Backends store per-function info that's waiting to be sent to the collector
  * in this hash table (indexed by function OID).
  */
@@ -815,7 +829,13 @@ pgstat_report_stat(bool force)
 pgstat_send_tabstat(this_msg);
 this_msg->m_nentries = 0;
 			}
+
+			/*
+			 * Entry will be zeroed soon so we need to remove it from the lookup table.
+			 */
+			hash_search(pgStatTabHash, >t_id, HASH_REMOVE, NULL);
 		}
+
 		/* zero out TableStatus structs after use */
 		MemSet(tsa->tsa_entries, 0,
 			   tsa->tsa_used * sizeof(PgStat_TableStatus));
@@ -1667,59 +1687,77 @@ pgstat_initstats(Relation rel)
 }
 
 /*
+ * Make sure pgStatTabList and pgStatTabHash are initialized.
+ */
+static void
+make_sure_stat_tab_initialized()
+{
+	HASHCTL ctl;
+
+	if (pgStatTabList)
+		return;
+
+	Assert(!pgStatTabHash);
+
+	memset(, 0, sizeof(ctl));
+	ctl.keysize = sizeof(Oid);
+	ctl.entrysize = sizeof(TabStatHashEntry);
+	ctl.hcxt = TopMemoryContext;
+
+	pgStatTabHash = hash_create("pgstat t_id to tsa_entry lookup table",
+		TABSTAT_QUANTUM, , HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+	pgStatTabList = (TabStatusArray *) MemoryContextAllocZero(TopMemoryContext,
+sizeof(TabStatusArray));
+}
+
+/*
  * get_tabstat_entry - find or create a PgStat_TableStatus entry for rel
  */
 static PgStat_TableStatus *
 get_tabstat_entry(Oid rel_id, bool isshared)
 {
+	TabStatHashEntry* hash_entry;
 	PgStat_TableStatus *entry;
 	TabStatusArray *tsa;
-	TabStatusArray *prev_tsa;
-	int			i;
+	bool found;
+
+	make_sure_stat_tab_initialized();
+
+	/*
+	 * Find an entry or create a new one.
+	 */
+	hash_entry = hash_search(pgStatTabHash, _id, HASH_ENTER, );
+	if(found)
+		return hash_entry->tsa_entry;
 
 	/*
-	 * Search the already-used tabstat slots for this relation.
+	 * `hash_entry` was just created and now we have to fill it.
+	 * First make sure there is a free space in a last element of pgStatTabList.
 	 */
-	prev_tsa = NULL;
-	for (tsa = pgStatTabList; tsa != NULL; prev_tsa = tsa, tsa = tsa->tsa_next)
+	tsa = pgStatTabList;
+	while(tsa->tsa_used == TABSTAT_QUANTUM)
 	{
-		for (i = 0; i < tsa->tsa_used; i++)
+		if(tsa->tsa_next == NULL)
 		{
-			entry = >tsa_entries[i];
-			if (entry->t_id == rel_id)
-return entry;
+			tsa->tsa_next = (TabStatusArray *) MemoryContextAllocZero(TopMemoryContext,
+		sizeof(TabStatusArray));
 		}
 
-		if (tsa->tsa_used < TABSTAT_QUANTUM)
-		{
-			/*
-			 * It must not be present, but we found a free slot instead. Fine,
-			 * let's use this one.  We assume the entry was already zeroed,
-			 * either at creation or after last use.
-			 */
-			entry = >tsa_entries[tsa->tsa_used++];
-			entry->t_id = rel_id;
-			entry->t_shared = isshared;
-			return entry;
-		}
+		tsa = tsa->tsa_next;
 	}
 
 	/*
-	 * We ran out of tabstat slots, so allocate more.  Be sure they're zeroed.
-	 */
-	tsa = (TabStatusArray *) MemoryContextAllocZero(TopMemoryContext,
-	sizeof(TabStatusArray));
-	if (prev_tsa)
-		prev_tsa->tsa_next = tsa;
-	else
-		pgStatTabList = tsa;
-
-	/*
-	 * Use the first entry of the new TabStatusArray.
+	 * Add an entry.
 	 */
 	entry = >tsa_entries[tsa->tsa_used++];
 	entry->t_id = rel_id;
 	entry->t_shared = isshared;
+

Re: [HACKERS] increasing the default WAL segment size

2017-03-24 Thread David Steele


On 3/24/17 12:27 AM, Peter Eisentraut wrote:

On 3/23/17 16:58, Stephen Frost wrote:

The backup tools need to also get the LSN from the pg_stop_backup and
verify that they have the WAL file associated with that LSN.


There is a function for that.


They also
need to make sure that they have all of the WAL files between the
starting LSN and the ending LSN.  Doing that requires understanding how
the files are named to make sure there aren't any missing.


There is not a function for that, but there could be one.


A function would be nice, but tools often cannot depend on the database 
being operational so it's still necessary to re-implement them.  Having 
a sane sequence in the WAL makes that easier.


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Petr Jelinek

On 24/03/17 11:23, Erik Rijkers wrote:
> On 2017-03-24 10:45, Mark Kirkwood wrote:
>>
>> However one minor observation - as Michael Banck noted - the elapsed
>> time for slave to catch up after running:
>>
>> $ pgbench -c8 -T600 bench
>>
>> on the master was (subjectively) much longer than for physical
>> streaming replication. Is this expected?
>>
> 
> I think you probably want to do (on the slave) :
> 
>   alter role  set synchronous_commit = off;
> 
> otherwise it's indeed extremely slow.
> 

Yes that might be one reason, see
https://www.postgresql.org/message-id/08b7053b-5679-0733-3af7-00b8cde62...@2ndquadrant.com
(although it probably needs rebase now)

Also don't forget that the snapbuild performance fixes haven't been
committed yet so it can take long time to even start.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Monitoring roles patch

2017-03-24 Thread Stephen Frost

Peter,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 3/22/17 09:17, Stephen Frost wrote:
> >> If we do it via GRANTs instead, then users can easily extend it.
> > The intent here is that users will *also* be able to do it via GRANTs if
> > they wish to.
> 
> But why not do it with GRANTs in the first place then?

This is akin to asking why do we need GRANT ALL and ALTER DEFAULT PRIVs.

Would it be technically possible to make users jump through hoops every
time they set up a new system to create their own monitor role that
would then have the right set of privileges for *that* version of PG?
Yes, but it's not exactly friendly.  The whole idea here is that the
tool authors will be able to tell the users that they just need to GRANT
this one role, not a script of 20 GRANT statements, oh, and that it
won't break when doing upgrades.

If we make the users run all the statements individually then they'll
also have to get an updated script for the next version of PG too
because we will have added things that the tools will want access to.

Further, they'll have to make sure to install all the contrib modules
they want to use before running the GRANT script which is provided, or
deal with GRANT'ing the rights out after installing some extension.

With the approach that Dave and I are advocating, we can avoid all of
that.  Contrib modules can bake-in GRANTs to the appropriate roles,
upgrades can be handled smoothly even when we add new capabilities which
are appropriate, users have a simple and straight-forward way to set up
good monitoring, and tool authors will know what permissions are
available and can finally have a better answer than "well, just make the
monior user superuser if you want to avoid all these complexities."

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Erik Rijkers


On 2017-03-24 10:45, Mark Kirkwood wrote:


However one minor observation - as Michael Banck noted - the elapsed
time for slave to catch up after running:

$ pgbench -c8 -T600 bench

on the master was (subjectively) much longer than for physical
streaming replication. Is this expected?



I think you probably want to do (on the slave) :

  alter role  set synchronous_commit = off;

otherwise it's indeed extremely slow.







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Backend crash on non-exclusive backup cancel

2017-03-24 Thread Teodor Sigaev


Thank you all, pushed

Michael Paquier wrote:

On Fri, Mar 24, 2017 at 1:45 AM, Teodor Sigaev  wrote:

I believe patch looks good and it's ready to commit.


Thanks for the review!


As I understand, it fixes bug introduced by
commit 7117685461af50f50c03f43e6a622284c8d54694
Date:   Tue Apr 5 20:03:49 2016 +0200

Implement backup API functions for non-exclusive backups


Indeed.


And, suppose, it should be backpatched to 9.6?


Yes, that's where non-exclusive backups have been introduced.



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-24 Thread Pavan Deolasee

On Fri, Mar 24, 2017 at 4:04 PM, Amit Kapila 
wrote:

> On Fri, Mar 24, 2017 at 12:25 AM, Pavan Deolasee
>  wrote:
> >
> >
> > On Thu, Mar 23, 2017 at 7:53 PM, Amit Kapila 
> >
> > The general sense I've got
> > here is that we're ok to push some work in background if it helps the
> > real-time queries, and I kinda agree with that.
> >
>
> I don't think we can define this work as "some" work, it can be a lot
> of work depending on the number of indexes.  Also, I think for some
> cases it will generate maintenance work without generating benefit.
> For example, when there is one index on a table and there are updates
> for that index column.
>
>
That's a fair point. I think we can address this though. At the end of
first index scan we would know how many warm pointers the index has and
whether it's worth doing a second scan. For the case you mentioned, we will
do a second scan just on that one index and skip on all other indexes and
still achieve the same result. On the other hand, if one index receives
many updates and other indexes are rarely updated then we might leave
behind a few WARM chains behind and won't be able to do IOS on those pages.
But given the premise that other indexes are receiving rare updates, it may
not be a problem. Note: the code is not currently written that way, but it
should be a fairly small change.

The other thing that we didn't talk about is that vacuum will need to track
dead tuples and warm candidate chains separately which increases memory
overhead. So for very large tables, and for the same amount of
maintenance_work_mem, one round of vacuum will be able to clean lesser
pages. We can work out more compact representation, but something not done
currently.


>
> > But this is clearly not
> > PG 10 material.
> >
>
> I don't see much discussion about this aspect of the patch, so not
> sure if it is acceptable to increase the cost of vacuum.  Now, I don't
> know if your idea of GUC's make it such that the additional cost will
> occur seldom and this additional pass has a minimal impact which will
> make it acceptable.


Yeah, I agree. I'm trying to schedule some more benchmarks, but any help is
appreciated.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

[HACKERS] [psql] patch to fix ordering in words_after_create array

2017-03-24 Thread Rushabh Lathia

Hi All,

While looking at the code around tab-complete.c, I
found the ordering in words_after_create array is not
correct for DEFAULT PRIVILEGES, which been added
under below commit:

commit d7d77f3825122bde55be9e06f6c4851028b99795
Author: Peter Eisentraut 
Date:   Thu Mar 16 18:54:28 2017 -0400

psql: Add completion for \help DROP|ALTER

PFA patch to fix the same.

Thanks,
Rushabh Lathia
www.EnterpriseDB.com
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index f749406..02a1571 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -999,8 +999,8 @@ static const pgsql_thing_t words_after_create[] = {
 	{"CONFIGURATION", Query_for_list_of_ts_configurations, NULL, THING_NO_SHOW},
 	{"CONVERSION", "SELECT pg_catalog.quote_ident(conname) FROM pg_catalog.pg_conversion WHERE substring(pg_catalog.quote_ident(conname),1,%d)='%s'"},
 	{"DATABASE", Query_for_list_of_databases},
-	{"DICTIONARY", Query_for_list_of_ts_dictionaries, NULL, THING_NO_SHOW},
 	{"DEFAULT PRIVILEGES", NULL, NULL, THING_NO_CREATE | THING_NO_DROP},
+	{"DICTIONARY", Query_for_list_of_ts_dictionaries, NULL, THING_NO_SHOW},
 	{"DOMAIN", NULL, _for_list_of_domains},
 	{"EVENT TRIGGER", NULL, NULL},
 	{"EXTENSION", Query_for_list_of_extensions},

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [COMMITTERS] pgsql: Fix and simplify check for whether we're running as Windows serv

2017-03-24 Thread Dave Page

On Friday, March 24, 2017, Heikki Linnakangas  wrote:

> On 03/22/2017 07:44 PM, Robert Haas wrote:
>
>> On Wed, Mar 22, 2017 at 10:13 AM, Alvaro Herrera
>>  wrote:
>>
>>> Heikki Linnakangas wrote:
>>>
 I did some archeology, and found CheckTokenMembership() in MinGW's
 w32api
 packages version 3.14
 (https://sourceforge.net/projects/mingw/files/MinGW/Base/
 w32api/w32api-3.14/,
 in include/winbase.h). According to the timestamps on that download
 page,
 that was released in 2009. That was the oldest version I could find, so
 it
 might go even further back.

 Dave, do you know exactly what version of MinGW narwhal is running? And
 how
 difficult is it to upgrade to something slightly more modern? Ease of
 upgrade is another good data point on how far we need to support old
 versions.

>>>
>>> Given that this was backpatched and that it broke narwhal in all
>>> branches, I think the solution needs to make narwhal work again without
>>> requiring it to upgrade; so we should acquire CheckTokenMembership via
>>> dynloading just like we do the other functions.  If we want to require a
>>> newer mingw version in pg10, that's acceptable, but it should be a
>>> separate patch.
>>>
>>
>> +1 for not moving the minimum system requirements in the back-branches.
>>
>
> Ok. I reverted this patch in the back-branches, and applied the much less
> invasive "V2" patch [1] instead. HEAD is unchanged, so narwhal still fails
> there.
>
> Dave: the consensus is that we no longer support the old version of MinGW
> that narwhal is using, for PostgreSQL v 10. Can you modify the
> configuration of narwhal to not try building 'master' anymore, or upgrade
> the toolchain, please?
>

I've disabled it. Thanks.


-- 
Dave Page
PostgreSQL Core Team
http://www.postgresql.org/

Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-24 Thread Aleksander Alekseev

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

This patch looks good to me. As I understand we have both a complete feature 
and a consensus in a thread here. If there are no objection, I'm marking this 
patch as "Ready for Commiter".

The new status of this patch is: Ready for Committer

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] scram and \password

2017-03-24 Thread Heikki Linnakangas


On 03/24/2017 03:02 PM, Michael Paquier wrote:

In order to close this thread, I propose to reuse the patches I sent
here to make scram_build_verifier() available to frontends:
https://www.postgresql.org/message-id/CAB7nPqT4yc3u8wspYkWbG088Ndp6asMH3=zb___ck89ctvz...@mail.gmail.com

And on top of it modify \password so as it generates a md5 verifier
for pre-9.6 servers and a scram one for post-10 servers by looking at
the backend version of the current connection. What do you think?


Yep, sounds like a plan.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] increasing the default WAL segment size

2017-03-24 Thread David Steele


On 3/23/17 4:45 PM, Peter Eisentraut wrote:

On 3/22/17 17:33, David Steele wrote:

I think if we don't change the default size it's very unlikely I would
use alternate WAL segment sizes or recommend that anyone else does, at
least in v10.

I simply don't think it would get the level of testing required to be
production worthy


I think we could tweak the test harnesses to run all the tests with
different segment sizes.  That would get pretty good coverage.


I would want to see 1,16,64 at a minimum.  More might be nice but that 
gets a bit ridiculous at some point.  I would be fine with different 
critters having different defaults.  I don't think that each critter 
needs to test each value.



More generally, the methodology that we should not add an option unless
we also change the default because the option would otherwise not get
enough testing is a bit dubious.


Generally, I would agree, but I think this is a special case.  This 
option has been around for a long time and we are just now exposing it 
in a way that's useful to end users.  It's easy to see how various 
assumptions may have arisen around the default and led to code that is 
not quite right when using different values.  Even if that's not true in 
the backend code, it might affect bin, and certainly affects third party 
tools.


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Peter Eisentraut

On 3/23/17 19:32, Petr Jelinek wrote:
> Yes, I also forgot to check if the table actually exists on subscriber
> when fetching them in CREATE SUBSCRIPTION (we have check during
> replication but not there).

I think for this we can probably just change the missing_ok argument of
RangeVarGetRelid() to false.

Unless we want the custom error message, in which case we need to change
AlterSubscription_refresh(), because right now it errors out because
missing_ok = false.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Petr Jelinek

On 24/03/17 15:05, Peter Eisentraut wrote:
> On 3/23/17 19:32, Petr Jelinek wrote:
>> Yes, I also forgot to check if the table actually exists on subscriber
>> when fetching them in CREATE SUBSCRIPTION (we have check during
>> replication but not there).
> 
> I think for this we can probably just change the missing_ok argument of
> RangeVarGetRelid() to false.
> 
> Unless we want the custom error message, in which case we need to change
> AlterSubscription_refresh(), because right now it errors out because
> missing_ok = false.
> 

You are right, stupid me.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
>From 81c7750095cfac35fad9a841309b2c5501e59a62 Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Fri, 24 Mar 2017 00:24:47 +0100
Subject: [PATCH 2/2] Check that published table exists on subscriber

---
 src/backend/commands/subscriptioncmds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 8f201b2..efe70b4 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -416,7 +416,7 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
 RangeVar   *rv = (RangeVar *) lfirst(lc);
 Oid			relid;
 
-relid = RangeVarGetRelid(rv, AccessShareLock, true);
+relid = RangeVarGetRelid(rv, AccessShareLock, false);
 
 SetSubscriptionRelState(subid, relid, table_state,
 		InvalidXLogRecPtr);
-- 
2.7.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Enabling parallelism for queries coming from SQL or other PL functions

2017-03-24 Thread Rafia Sabih

On Thu, Mar 23, 2017 at 11:26 PM, Robert Haas  wrote:

>
> The changes to the plpgsql code don't look so good to me.  The change
> to exec_stmt_return_query fixes the same bug that I mentioned in the
> email linked above, but only half of it -- it corrects the RETURN
> QUERY EXECUTE case but not the RETURN QUERY case.  And it's clearly a
> separate change; that part is a bug fix, not an enhancement.

My bad. Since, you have given this as a separate patch in the link
upthread, I suppose there's nothing expected from me regarding this
right now.

> Some of
> the other changes depend on whether we're in a trigger, which seems
> irrelevant to whether we can use parallelism. Even if the outer query
> is doing writes, we can still use parallelism for queries inside the
> trigger function if warranted.  It's probably a rare case to have
> queries inside a trigger that are expensive enough to justify such
> handling but I don't see the value of putting in special-case logic to
> prevent it.
>
Fixed. I confused it with not allowing parallel workers when update
command is in progress.

> I suspect that code fails to achieve its goals anyway.  At the top of
> exec_eval_expr(), you call exec_prepare_plan() and unconditionally
> pass CURSOR_OPT_PARALLEL_OK, so when that function returns, expr->plan
> might now be a parallel plan.  If we reach the call to
> exec_run_select() further down in that function, and if we happen to
> pass false, it's not going to matter, because exec_run_select() is
> going to find the plan already initialized.
>
True, fixed.
The attached patch is to be applied over [1].

[1]  
https://www.postgresql.org/message-id/CA%2BTgmoZ_ZuH%2BauEeeWnmtorPsgc_SmP%2BXWbDsJ%2BcWvWBSjNwDQ%40mail.gmail.com
-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


pl_parallel_opt_support_v4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-03-24 Thread Amit Langote

Hi Ashutosh,

On 2017/03/23 21:48, Ashutosh Bapat wrote:
>>> I have fixed all the issues reported till now.

I've tried to fix your 0012 patch (Multi-level partitioned table
expansion) considering your message earlier on this thread [1].
Especially the fact that no AppendRelInfo and RelOptInfo are allocated for
partitioned child tables as of commit d3cc37f1d [2].  I've fixed
expand_inherited_rtentry() such that AppendRelInfo *is* allocated for a
partitioned child RTEs whose rte->inh is set to true.  Such an RTE is
recursively expanded with that RTE the parent.

Also as I mentioned elsewhere [3], the multi-level inheritance expansion
of partitioned table will break update/delete for partitioned table, which
is because inheritance_planner() is not ready to handle inheritance sets
structured that way.  I tried to refactor inheritance_planner() such that
its core logic can be recursively invoked for partitioned child RTEs.  The
resulting child paths and other auxiliary information related to planning
across the hierarchy are maintained in one place using a struct to hold
the same in a few flat lists.  The refactoring didn't break any existing
tests and a couple of new tests are added to check that it indeed works
for multi-level partitioned tables expanded using new multi-level structure.

There is some test failure in 0014 (Multi-level partition-wise join
tests), probably because of the changes I made to 0012, which I didn't get
time to check why, although I've checked using an example that multi-level
join planning still works, so it's not completely broken either.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CAFjFpRefs5ZMnxQ2vP9v5zOtWtNPuiMYc01sb1SWjCOB1CT%3DuQ%40mail.gmail.com

[2] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d3cc37f1d

[3]
https://www.postgresql.org/message-id/744d20fe-fc7b-f89e-8d06-6496ec537b86%40lab.ntt.co.jp
>From 7051b9cb4760908e23e64969988b58fcb466868e Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat 
Date: Fri, 10 Feb 2017 13:50:14 +0530
Subject: [PATCH 12/14] Multi-level partitioned table expansion.

Construct inheritance hierarchy of a partitioned table to reflect the
partition hierarchy.

Commit d3cc37f1d got rid of AppendRelInfos for *all* partitioned child
child tables.  But we'd need one for recursively expandded partitioned
children.  In its absence, some of the later planning stages would not be
able to access the partitions below the first level, because they are not
directly linked to the root parent.  Fix that by allocating one.  That
means such children also get a RelOptInfo, which also wasn't allocated
previously.

Expanding partitioned table inheritance this new way means we need to
teach a few places how to traverse the hierarchy, such as propagating
lateral join information from the root table down the partition
hierarchy.

Further, inheritance_planner() needs refactoring to enable recursive
invocation of its core logic.  While the query_planner() can handle the
new partition hierarchical structure just fine because of the way the
relevant code works, inheritance_planner() would miss any child tables
below the first level, because it can't do recursion.  Refactor it so
that it can.
---
 src/backend/optimizer/plan/initsplan.c |  14 +-
 src/backend/optimizer/plan/planner.c   | 290 +
 src/backend/optimizer/prep/prepunion.c |  69 ++--
 src/test/regress/expected/inherit.out  |  22 +++
 src/test/regress/sql/inherit.sql   |  17 ++
 5 files changed, 292 insertions(+), 120 deletions(-)

diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index b4ac224a7a..20774b29fe 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -630,7 +630,19 @@ create_lateral_join_info(PlannerInfo *root)
 	{
 		RelOptInfo *brel = root->simple_rel_array[rti];

-		if (brel == NULL || brel->reloptkind != RELOPT_BASEREL)
+		if (brel == NULL)
+			continue;
+
+		/*
+		 * If an "other rel" RTE is a "partitioned table", we must propagate
+		 * the lateral info inherited all the way from the root parent to its
+		 * children. That's because the children are not linked directly with
+		 * the root parent via AppendRelInfo's unlike in case of a regular
+		 * inheritance set (see expand_inherited_rtentry()).  Failing to
+		 * do this would result in those children not getting marked with the
+		 * appropriate lateral info.
+		 */
+		if (brel->reloptkind != RELOPT_BASEREL && !brel->part_scheme)
 			continue;

 		if (root->simple_rte_array[rti]->inh)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 4487683196..59cb559b9e 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -91,10 +91,28 @@ typedef struct
 	List	   *groupClause;	/* overrides parse->groupClause */
 } standard_qp_extra;

+/* Result of a given invocation of

Re: [HACKERS] scram and \password

2017-03-24 Thread Heikki Linnakangas


On 03/23/2017 06:41 AM, Michael Paquier wrote:

And after a lookup the failure is here:
-   result = get_role_password(port->user_name, _pass, logdetail);
+   shadow_pass = get_role_password(port->user_name, logdetail);
if (result == STATUS_OK)
result is never setup in this code path, so that may crash.


Ah, of course. For some reason, I has -Wno-maybe-uninitialized in my 
configure command line. Without that, gcc even warns about that.


Fixed, and pushed. Thanks!

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Backend crash on non-exclusive backup cancel

2017-03-24 Thread Michael Paquier

On Fri, Mar 24, 2017 at 7:58 PM, Teodor Sigaev  wrote:
> Thank you all, pushed.

Thanks.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: Declarative partitioning optimization for large amount of partitions

2017-03-24 Thread Aleksander Alekseev

Also I would like to share some benchmark results.

Before applying any patches (copied from one of previous messages):

```
latency average = 1153.495 ms
latency stddev = 154.366 ms
tps = 6.061104 (including connections establishing)
tps = 6.061211 (excluding connections establishing)
```

After applying first patch (copied as well):

```
latency average = 853.862 ms
latency stddev = 112.270 ms
tps = 8.191861 (including connections establishing)
tps = 8.192028 (excluding connections establishing)
```

After applying both patches:

```
latency average = 533.646 ms
latency stddev = 86.311 ms
tps = 13.110328 (including connections establishing)
tps = 13.110608 (excluding connections establishing)
```

Small amount of partitions (2 to be exact), no patches:

```
latency average = 0.928 ms
latency stddev = 0.296 ms
tps = 7535.224897 (including connections establishing)
tps = 7536.145457 (excluding connections establishing)
```

Small amount of partitions, bot patches applied:

```
latency average = 0.915 ms
latency stddev = 0.269 ms
tps = 7638.344922 (including connections establishing)
tps = 7639.164769 (excluding connections establishing)
```

As you can see these patches don't make things worse in any regard.

Perf shows that both bottlenecks are gone. Before [1] and after [2].

[1] http://afiskon.ru/s/00/2008c4ae66_temp.png
[2] http://afiskon.ru/s/a5/fd81628a3a_temp.png

On Fri, Mar 24, 2017 at 03:17:44PM +0300, Aleksander Alekseev wrote:
> Hi Anastasia,
> 
> Thanks a lot for a review!
> 
> As was mentioned above there is also a bottleneck in find_all_inheritors
> procedure. Turned out the problem there is similar and it could be easily
> fixed as well. Corresponding patch is attached to this message. To keep
> things in order I'm attaching the previous patch as well.

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature

Re: [HACKERS] Declarative partitioning optimization for large amount of partitions

2017-03-24 Thread Simon Riggs

On 1 March 2017 at 01:36, Amit Langote  wrote:

> I don't know which way you're thinking of fixing this, but a planner patch
> to implement faster partition-pruning will have taken care of this, I
> think.  As you may know, even declarative partitioned tables currently
> depend on constraint exclusion for partition-pruning and planner's current
> approach of handling inheritance requires to open all the child tables
> (partitions), whereas the new approach hopefully shouldn't need to do
> that.  I am not sure if looking for a more localized fix for this would be
> worthwhile, although I may be wrong.

What "new approach" are we discussing?

Is there a patch or design discussion?

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Should we cacheline align PGXACT?

2017-03-24 Thread Simon Riggs

On 10 March 2017 at 13:08, Alexander Korotkov 
wrote:

Results look good for me.  Idea of committing both of patches looks
> attractive.
>

I'll commit mine since I understand what it does. I'll look at the other
one also, but won't commit yet.


> We have pretty much acceleration for read-only case and small acceleration
> for read-write case.
> I'll run benchmark on 72-cores machine as well.
>

No need to wait for that.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] increasing the default WAL segment size

2017-03-24 Thread Stephen Frost

Jeff,

* Jeff Janes (jeff.ja...@gmail.com) wrote:
> On Thu, Mar 23, 2017 at 1:45 PM, Peter Eisentraut <
> peter.eisentr...@2ndquadrant.com> wrote:
> > On 3/22/17 17:33, David Steele wrote:
> > > and I doubt that most tool writers would be quick to
> > > add support for a feature that very few people (if any) use.
> >
> > I'm not one of those tool writers, although I have written my share of
> > DBA scripts over the years, but I wonder why those tools would really
> > care.  They are handed files with predetermined names to archive, and
> > for restore files with predetermined names are requested back from them.
> >  What else do they need?  If something is missing that requires them to
> > parse file names, then maybe that should be added.
> 
> I have a pg_restore which predicts the file 5 files ahead of the one it was
> asked for, and initiates a pre-fetch and decompression of it. Then it
> delivers the file it was asked for, either by pulling it out of the
> pre-staging area set up by the N-5th invocation, or by going directly to
> the archive to get it.  This speeds up play-back dramatically when the
> files are stored compressed and non-local.

Ah, yes, that is on our road-map for pgBackrest to do also, along with
parallel WAL fetch which also needs to figure out the WAL names before
being asked for them.

We do already have parallel push, which also needs to figure out what
the upcoming file names are going to be so we can find them and push
them when they're indicated as ready in archive_status.  Perhaps we
could just push whatever is ready and remember everything that was
pushed for when PG asks, but that is really not ideal.

> That is why I need to know how the files are numbered.  I don't think that
> that makes much of a difference, though.  Any change is going to break
> that, no matter which change.  Then I'll fix it.

Right, but the discussion here is actually about the idea that we're
going to encourage people to use the initdb-time option to change the
WAL size, meaning you'll need to deal with different WAL sizes and
different naming due to that, and then we're going to turn around in the
very next release and break the naming, meaning you'll have to adjust
your tools first for the different possible WAL sizes in PG10 and then
adjust again for the different naming in PG11.

I'm trying to suggest that if we're going to do this that, perhaps, we
should try to make both the changes in one release instead of across
two.

> If we are going to break it, I'd prefer to just do away with the 'segment'
> thing altogether.  You have timelines, and you have files.  That's it.

I'm not sure I follow this proposal.  We have to know which WAL file has
which LSN in it, how do you do that with just 'timelines and files'?

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] [PATCH] Suppress Clang 3.9 warnings

2017-03-24 Thread Aleksander Alekseev

Hi Tom,

Since no one seems to be particularly excited about this patch I'm
marking it as "Returned with feedback" to save reviewers time.

On Wed, Mar 15, 2017 at 12:21:21PM -0400, Tom Lane wrote:
> Noah Misch  writes:
> > On Wed, Mar 15, 2017 at 10:57:15AM -0400, Tom Lane wrote:
> >> Seems like the correct solution is to
> >> absorb that fix, either by updating to a newer autoconf release or by
> >> carrying our own version of AC_CHECK_DECLS until they come out with one.
> 
> > As you mention upthread, that Autoconf commit is still newer than every
> > Autoconf release.  (Last release was 58 months ago.)  Altering configure.ac 
> > to
> > work around the bug would be reasonable, but it feels heavy relative to the
> > benefit of suppressing some warnings.
> 
> It does seem like rather a lot of work, but I think it's preferable to
> hacking up the coding in port.h.  Mainly because we could booby-trap the
> substitute AC_CHECK_DECLS to make sure we revert it whenever autoconf 2.70
> does materialize (a check on m4_PACKAGE_VERSION, like the one at
> configure.in line 22, ought to do the trick); whereas I do not think
> we'd remember to de-kluge port.h if we kluge around it there.
> 
> I'm fine with leaving it alone, too.
> 
>   regards, tom lane

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature

Re: [HACKERS] scram and \password

2017-03-24 Thread Michael Paquier

On Fri, Mar 24, 2017 at 8:36 PM, Heikki Linnakangas  wrote:
> On 03/23/2017 06:41 AM, Michael Paquier wrote:
>>
>> And after a lookup the failure is here:
>> -   result = get_role_password(port->user_name, _pass, logdetail);
>> +   shadow_pass = get_role_password(port->user_name, logdetail);
>> if (result == STATUS_OK)
>> result is never setup in this code path, so that may crash.
>
> Ah, of course. For some reason, I has -Wno-maybe-uninitialized in my
> configure command line. Without that, gcc even warns about that.
>
> Fixed, and pushed. Thanks!

OK, cool.

In order to close this thread, I propose to reuse the patches I sent
here to make scram_build_verifier() available to frontends:
https://www.postgresql.org/message-id/CAB7nPqT4yc3u8wspYkWbG088Ndp6asMH3=zb___ck89ctvz...@mail.gmail.com

And on top of it modify \password so as it generates a md5 verifier
for pre-9.6 servers and a scram one for post-10 servers by looking at
the backend version of the current connection. What do you think?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-24 Thread Amit Kapila

On Thu, Mar 23, 2017 at 3:54 PM, Pavan Deolasee
 wrote:
> Thanks Amit. v19 addresses some of the comments below.
>
> On Thu, Mar 23, 2017 at 10:28 AM, Amit Kapila 
> wrote:
>>
>> On Wed, Mar 22, 2017 at 4:06 PM, Amit Kapila 
>> wrote:
>> > On Tue, Mar 21, 2017 at 6:47 PM, Pavan Deolasee
>> >  wrote:
>
>>
>> 5.
>> +btrecheck(Relation indexRel, IndexInfo *indexInfo, IndexTuple indexTuple,
>> + Relation heapRel, HeapTuple heapTuple)
>> {
>> ..
>> + if (!datumIsEqual(values[i - 1], indxvalue, att->attbyval,
>> + att->attlen))
>> ..
>> }
>>
>> Will this work if the index is using non-default collation?
>>
>
> Not sure I understand that. Can you please elaborate?
>

I was worried for the case if the index is created non-default
collation, will the datumIsEqual() suffice the need.  Now again
thinking about it, I think it will because in the index tuple we are
storing the value as in heap tuple.  However today it occurred to me
how will this work for toasted index values (index value >
TOAST_INDEX_TARGET).  It is mentioned on top of datumIsEqual() that it
probably won't work for toasted values.  Have you considered that
point?

>>
>> 6.
>> +++ b/src/backend/access/nbtree/nbtxlog.c
>> @@ -390,83 +390,9 @@ btree_xlog_vacuum(XLogReaderState *record)
>> -#ifdef UNUSED
>>   xl_btree_vacuum *xlrec = (xl_btree_vacuum *) XLogRecGetData(record);
>>
>>   /*
>> - * This section of code is thought to be no longer needed, after analysis
>> - * of the calling paths. It is retained to allow the code to be
>> reinstated
>> - * if a flaw is revealed in that thinking.
>> - *
>> ..
>>
>> Why does this patch need to remove the above code under #ifdef UNUSED
>>
>
> Yeah, it isn't strictly necessary. But that dead code was coming in the way
> and hence I decided to strip it out. I can put it back if it's an issue or
> remove that as a separate commit first.
>

I think it is better to keep unrelated changes out of patch.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] create_unique_path and GEQO

2017-03-24 Thread Tom Lane

Ashutosh Bapat  writes:
>> Do you have test case, which can reproduce the issue you
>> explained above?

> No. It would require some surgery in standard_planner() to measure the
> memory consumed in the planner context OR build the code with
> SHOW_MEMORY_STATS defined and dump memory context statistics and check
> planner context memory usage. I don't think I can produce a testcase
> quickly right now. But then, I think the problem is quite apparent
> from the code inspection alone, esp. comparing what mark_dummy_rel()
> does with what create_unique_path() is doing.

Yeah.  I think the code in mark_dummy_rel is newer and better-thought-out
than what's in create_unique_path.  It probably makes sense to change over.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Monitoring roles patch

2017-03-24 Thread Dave Page

On Fri, Mar 24, 2017 at 4:24 PM, Robert Haas  wrote:
>
>> If we make the users run all the statements individually then they'll
>> also have to get an updated script for the next version of PG too
>> because we will have added things that the tools will want access to.
>
> That's possible, but it really depends on the tool, not on core
> PostgreSQL.  The tool should be the one providing the script, because
> the tool is what knows its own permissions requirements.  Users who
> care about security won't want to grant every privilege given by a
> pg_monitor role to a tool that only needs a subset of those
> privileges.

The upshot of this would be that every time there's a database server
upgrade that changes the permissions required somehow, the user has to
login to every server they have and run a script. It is no longer a
one-time thing, which makes it vastly more painful to deal with
upgrades.

>> With the approach that Dave and I are advocating, we can avoid all of
>> that.  Contrib modules can bake-in GRANTs to the appropriate roles,
>> upgrades can be handled smoothly even when we add new capabilities which
>> are appropriate, users have a simple and straight-forward way to set up
>> good monitoring, and tool authors will know what permissions are
>> available and can finally have a better answer than "well, just make the
>> monior user superuser if you want to avoid all these complexities."
>
> They can have that anyway.  They just have to run a script provided by
> the tool rather than one provided by the core project as a
> one-size-fits-all solution for every tool.

Do you object to having individual default roles specifically for
cases where there are superuser-only checks at the moment that prevent
GRANT? e.g. pg_show_all_settings for SHOW, pg_show_all_stats for
pg_tablespace_size and friends, pg_stat_statements for, well,
pg_stat_statements and so on? It would be an inferior solution in my
opinion, given that it would demonstrably cause users more work, but
at least we could do something.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

1 2 >

1 - 100 of 155 matches

Mail list logo