Re: Getting rid of some more lseek() calls

2020-02-11 Thread Thomas Munro
On Wed, Feb 12, 2020 at 6:42 PM Michael Paquier  wrote:
> On Tue, Feb 11, 2020 at 06:04:09PM +1300, Thomas Munro wrote:
> > lseek(SEEK_END) seems to be nearly twice as fast as fstat() if you
> > just call it in a big loop, on Linux and FreeBSD (though I didn't
> > investigate exactly why, mitigations etc, it certainly returns more
> > stuff so there's that).
>
> Interesting.  What of Windows?  We've had a problem for some time now
> with fetching the size of files larger than 4GB (COPY, dumps..).  I am
> wondering if we could not take advantage of that for those cases:
> https://www.postgresql.org/message-id/15858-9572469fd3b73...@postgresql.org

Hmm.  Well, on Unix we have to choose between "tell me the size but
also change the position that I either don't care about or have to
undo", and "tell me the size but also tell me all this other stuff I
don't care about".  Since Windows apparently has GetFileSizeEx(), why
not use that when that's exactly what you want?  It apparently
understands large files.

https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getfilesizeex




Re: WIP: WAL prefetch (another approach)

2020-02-11 Thread Thomas Munro
On Fri, Jan 3, 2020 at 5:57 PM Thomas Munro  wrote:
> On Fri, Jan 3, 2020 at 7:10 AM Tomas Vondra
>  wrote:
> > Could we instead specify the number of blocks to prefetch? We'd probably
> > need to track additional details needed to determine number of blocks to
> > prefetch (essentially LSN for all prefetch requests).

Here is a new WIP version of the patch set that does that.  Changes:

1.  It now uses effective_io_concurrency to control how many
concurrent prefetches to allow.  It's possible that we should have a
different GUC to control "maintenance" users of concurrent I/O as
discussed elsewhere[1], but I'm staying out of that for now; if we
agree to do that for VACUUM etc, we can change it easily here.  Note
that the value is percolated through the ComputeIoConcurrency()
function which I think we should discuss, but again that's off topic,
I just want to use the standard infrastructure here.

2.  You can now change the relevant GUCs (wal_prefetch_distance,
wal_prefetch_fpw, effective_io_concurrency) at runtime and reload for
them to take immediate effect.  For example, you can enable the
feature on a running replica by setting wal_prefetch_distance=8kB
(from the default of -1, which means off), and something like
effective_io_concurrency=10, and telling the postmaster to reload.
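For instance, the reload-based setup on a standby might look like this, using the GUC names from this patch set (which may still change during review):

```
# postgresql.conf on the standby
wal_prefetch_distance = 8kB      # -1 (the default) means off
effective_io_concurrency = 10
```

followed by a `pg_ctl reload` (or `SELECT pg_reload_conf()`) to take effect.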

3.  The new code is moved out to a new file
src/backend/access/transam/xlogprefetcher.c, to minimise new bloat in
the mighty xlog.c file.  Functions were renamed to make their purpose
clearer, and a lot of comments were added.

4.  The WAL receiver now exposes the current 'write' position via an
atomic value in shared memory, so we don't need to hammer the WAL
receiver's spinlock.

5.  There is some rudimentary user documentation of the GUCs.

[1] https://www.postgresql.org/message-id/13619.1557935593%40sss.pgh.pa.us


0001-Allow-PrefetchBuffer-to-be-called-with-a-SMgrRela-v2.patch
Description: Binary data


0002-Rename-GetWalRcvWriteRecPtr-to-GetWalRcvFlushRecP-v2.patch
Description: Binary data


0003-Add-WalRcvGetWriteRecPtr-new-definition-v2.patch
Description: Binary data


0004-Allow-PrefetchBuffer-to-report-missing-file-in-re-v2.patch
Description: Binary data


0005-Prefetch-referenced-blocks-during-recovery-v2.patch
Description: Binary data


Re: [Proposal] Add accumulated statistics for wait event

2020-02-11 Thread Craig Ringer
On Wed, 12 Feb 2020 at 12:36, imai.yoshik...@fujitsu.com
 wrote:

> It seems the performance difference is big in the case of read-only
> tests. The reason is that the write time is relatively longer than the
> processing time of the logic I added in the patch.

That's going to be a pretty difficult performance hit to justify.

Can we buffer collected wait events locally and spit the buffer to the
stats collector at convenient moments? We can use a limited buffer
size with an overflow flag, so we degrade the results rather than
falling over or forcing excessive stats reporting at inappropriate
times.

I suggest that this is also a good opportunity to add some more
tracepoints to PostgreSQL. The wait events facilities are not very
traceable right now. Exposing some better TRACE_POSTGRESQL_
tracepoints (SDTs) via probes.d would help users collect better
information using external tools like perf, bpftrace and systemtap.
That way we have a zero-overhead-when-unused option that can also be
used to aggregate the information per-query, per-user, etc.

(I really need to add a bunch more tracepoints to make this easier...)

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise




Re: Getting rid of some more lseek() calls

2020-02-11 Thread Michael Paquier
On Tue, Feb 11, 2020 at 06:04:09PM +1300, Thomas Munro wrote:
> lseek(SEEK_END) seems to be nearly twice as fast as fstat() if you
> just call it in a big loop, on Linux and FreeBSD (though I didn't
> investigate exactly why, mitigations etc, it certainly returns more
> stuff so there's that).

Interesting.  What of Windows?  We've had a problem for some time now
with fetching the size of files larger than 4GB (COPY, dumps..).  I am
wondering if we could not take advantage of that for those cases:
https://www.postgresql.org/message-id/15858-9572469fd3b73...@postgresql.org
--
Michael


signature.asc
Description: PGP signature


Re: client-side fsync() error handling

2020-02-11 Thread Michael Paquier
On Tue, Feb 11, 2020 at 09:22:54AM +0100, Peter Eisentraut wrote:
> Digging around through the call stack, I think changing fsync_fname() to
> just call exit(1) on errors instead of proceeding would address most of
> that.
>
> Thoughts?

Doing things as you do in your patch sounds fine to me for this part.
Now, don't we need to care about durable_rename() and make the
panic-like failure an optional choice?  From what I can see, this
routine is used now in the backend for pg_basebackup to rename
temporary history files or partial WAL segments.
--
Michael


signature.asc
Description: PGP signature


Re: pgsql: walreceiver uses a temporary replication slot by default

2020-02-11 Thread Michael Paquier
On Mon, Feb 10, 2020 at 01:46:04PM -0800, Andres Freund wrote:
> I still architecturally don't find it attractive that the active
> configuration between walreceiver and startup process can diverge
> though. Imagine if we e.g. added the ability to receive WAL over
> multiple connections from one host, or from multiple hosts (e.g. to be
> able to get the bulk of the WAL from a cascading node, but also to
> provide syncrep acknowledgements directly to the primary), or to allow
> for logical replication without needing all WAL locally on a standby
> doing decoding.  It seems not great if there's potentially diverging
> configuration (hot standby feedback, temporary slots, ... ) between
> those walreceivers, just depending on when they started.  Here the model
> e.g. parallel workers use, which explicitly ensure that the GUC state is
> the same in workers and the leader, is considerably better, imo.

Yes, I still think that we should fix that inconsistency, mark the new
GUC wal_receiver_create_temp_slot as PGC_POSTMASTER, and add a note at
the top of RequestXLogStreaming() and walreceiver.c about the
assumptions we'd prefer to rely on for the GUCs starting a WAL receiver.

> So I think adding more of these parameters affecting walreceivers
> without coordination is not going quite in the right direction.

Indeed.  Adding more comments would be one way to prevent the
situation from happening here; I fear that others may forget this stuff
in the future.
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-11 Thread Masahiko Sawada
On Tue, 11 Feb 2020 at 11:31, Amit Kapila  wrote:
>
> On Wed, Feb 5, 2020 at 12:07 PM Masahiko Sawada  wrote:
> >
> >
> > Unfortunately the environment I used for performance verification is
> > no longer available.
> >
> > I agree to run this test in a different environment. I've attached the
> > rebased version patch. I'm measuring the performance with/without
> > patch, so will share the results.
> >
>
> Did you get a chance to run these tests?  Lately, Mahendra has done a
> lot of performance testing of this patch and shared his results.  I
> don't see much downside with the patch, rather there is a performance
> increase of 3-9% in various scenarios.

I've done performance tests on my laptop while changing the number of
partitions. 4 clients concurrently insert 32 tuples into randomly
selected partitions in a transaction. Therefore, by changing the number
of partitions, the contention on the relation extension lock also
changes. All tables are unlogged tables and N_RELEXTLOCK_ENTS is 1024.

Here are my test results:

* HEAD
nchilds = 64 tps = 33135
nchilds = 128 tps = 31249
nchilds = 256 tps = 29356

* Patched
nchilds = 64 tps = 32057
nchilds = 128 tps = 32426
nchilds = 256 tps = 29483

The performance has been slightly improved by the patch in two cases.
I've also attached the shell script I used to test.

When I set N_RELEXTLOCK_ENTS to 1 so that all relation extension locks
conflict, the result is:

nchilds = 64 tps = 30887
nchilds = 128 tps = 30015
nchilds = 256 tps = 27837

Regards,

-- 
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
NDATA=32
NTRY=2
NCLIENTS=4
TIME=120

for childs in 64 128 256
do
pg_ctl stop -mi
rm -r $PGDATA
initdb -E UTF8 --no-locale
cat <<EOF >> $PGDATA/postgresql.conf
shared_buffers = 512MB
max_wal_size = 20GB
checkpoint_timeout = 1h
EOF
pg_ctl start

psql -c "create unlogged table parent (c int) partition by list(c)"

cat <<EOF | psql > /dev/null
select 'create unlogged table p' || i || ' partition of parent for values in (' || i || ')' from generate_series(0,$childs) i; \gexec
EOF

echo "insert into parent select ((random() * 1000)::int % $childs) from generate_series(1,$NDATA)" > data.sql

pgbench -i -n postgres
avg=0
total=0
for t in `seq 1 $NTRY`
do
	tps=`bin/pgbench -T $TIME -c ${NCLIENTS} -f data.sql -n  postgres | grep "excluding" | cut -d " " -f 3`
	echo "CHILDS = $childs, TRIS = $t, TPS = $tps"
	total=$(echo "$tps + $total" | bc)
done
avg=$(echo "$total / $NTRY" | bc)
echo "nchilds = $childs tps = $avg" >> result_${1}.txt
done

pg_ctl stop


Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-11 Thread Andres Freund
Hi,

On 2020-02-11 08:01:34 +0530, Amit Kapila wrote:
> I don't see much downside with the patch, rather there is a
> performance increase of 3-9% in various scenarios.

As I wrote in [1] I started to look at this patch. My problem with it
is that it just seems like the wrong direction architecturally to
me. There are two main aspects to this:

1) It basically builds another, more lightweight but less capable,
   lock manager that can lock more objects than we can have distinct
   locks for.  It is faster because it uses *one* hashtable without
   conflict handling, because it has fewer lock modes, and because it
   doesn't support detecting deadlocks. And probably some other things.

2) A lot of the contention around file extension comes from us doing
   multiple expensive things under one lock (determining current
   relation size, searching victim buffer, extending file), and in tiny
   increments (growing a 1TB table by 8kb). This patch doesn't address
   that at all.

I've focused on 1) in the email referenced above ([1]). Here I'll focus
on 2).

To quantify my concerns I instrumented postgres to measure the time for
various operations that are part of extending a file (all per
process). The hardware is a pretty fast nvme, with unlogged tables, on a
20/40 core/threads machine. The workload is copying a scale 10
pgbench_accounts into an unindexed, unlogged table using pgbench.

Here are the instrumentations for various client counts, when just
measuring 20s:

1 client:
LOG:  extension time: lock wait: 0.00 lock held: 3.19 filesystem: 1.29 
buffersearch: 1.58

2 clients:
LOG:  extension time: lock wait: 0.47 lock held: 2.99 filesystem: 1.24 
buffersearch: 1.43
LOG:  extension time: lock wait: 0.60 lock held: 3.05 filesystem: 1.23 
buffersearch: 1.50

4 clients:
LOG:  extension time: lock wait: 3.92 lock held: 2.69 filesystem: 1.10 
buffersearch: 1.29
LOG:  extension time: lock wait: 4.40 lock held: 2.02 filesystem: 0.81 
buffersearch: 0.93
LOG:  extension time: lock wait: 3.86 lock held: 2.59 filesystem: 1.06 
buffersearch: 1.22
LOG:  extension time: lock wait: 4.00 lock held: 2.65 filesystem: 1.08 
buffersearch: 1.26

8 clients:
LOG:  extension time: lock wait: 6.94 lock held: 1.74 filesystem: 0.70 
buffersearch: 0.80
LOG:  extension time: lock wait: 7.16 lock held: 1.81 filesystem: 0.73 
buffersearch: 0.82
LOG:  extension time: lock wait: 6.93 lock held: 1.95 filesystem: 0.80 
buffersearch: 0.89
LOG:  extension time: lock wait: 7.08 lock held: 1.87 filesystem: 0.76 
buffersearch: 0.86
LOG:  extension time: lock wait: 6.95 lock held: 1.95 filesystem: 0.80 
buffersearch: 0.89
LOG:  extension time: lock wait: 6.88 lock held: 2.01 filesystem: 0.83 
buffersearch: 0.93
LOG:  extension time: lock wait: 6.94 lock held: 2.02 filesystem: 0.82 
buffersearch: 0.93
LOG:  extension time: lock wait: 7.02 lock held: 1.95 filesystem: 0.80 
buffersearch: 0.89

16 clients:
LOG:  extension time: lock wait: 10.37 lock held: 0.88 filesystem: 0.36 
buffersearch: 0.39
LOG:  extension time: lock wait: 10.53 lock held: 0.90 filesystem: 0.37 
buffersearch: 0.40
LOG:  extension time: lock wait: 10.72 lock held: 1.01 filesystem: 0.42 
buffersearch: 0.45
LOG:  extension time: lock wait: 10.45 lock held: 1.25 filesystem: 0.52 
buffersearch: 0.55
LOG:  extension time: lock wait: 10.66 lock held: 0.94 filesystem: 0.38 
buffersearch: 0.41
LOG:  extension time: lock wait: 10.50 lock held: 1.27 filesystem: 0.53 
buffersearch: 0.56
LOG:  extension time: lock wait: 10.53 lock held: 1.19 filesystem: 0.49 
buffersearch: 0.53
LOG:  extension time: lock wait: 10.57 lock held: 1.22 filesystem: 0.50 
buffersearch: 0.53
LOG:  extension time: lock wait: 10.72 lock held: 1.17 filesystem: 0.48 
buffersearch: 0.52
LOG:  extension time: lock wait: 10.67 lock held: 1.32 filesystem: 0.55 
buffersearch: 0.58
LOG:  extension time: lock wait: 10.95 lock held: 0.92 filesystem: 0.38 
buffersearch: 0.40
LOG:  extension time: lock wait: 10.81 lock held: 1.24 filesystem: 0.51 
buffersearch: 0.56
LOG:  extension time: lock wait: 10.62 lock held: 1.27 filesystem: 0.53 
buffersearch: 0.56
LOG:  extension time: lock wait: 11.14 lock held: 0.94 filesystem: 0.38 
buffersearch: 0.41
LOG:  extension time: lock wait: 11.20 lock held: 0.96 filesystem: 0.39 
buffersearch: 0.42
LOG:  extension time: lock wait: 10.75 lock held: 1.41 filesystem: 0.58 
buffersearch: 0.63
(Sum of the 16-client lock-held times: 0.88 + 0.90 + 1.01 + 1.25 + 0.94 +
1.27 + 1.19 + 1.22 + 1.17 + 1.32 + 0.92 + 1.24 + 1.27 + 0.94 + 0.96 +
1.41 = 17.89.)
In *none* of these cases does the drive get even close to being
saturated; not even to 1/3 of capacity.


If you compare the total time with the lock held against the total time
of the test, it becomes obvious that we quite quickly spend the majority
of the total time with the lock held:
client count 1: 3.18/20 = 0.16
client count 2: 6.04/20 = 0.30
client count 4: 9.95/20 = 0.50
client count 8: 15.30/20 = 0.76
client count 16: 17.89/20 = 0.89

In other words, the reason that relation extension 

Re: Add %x to PROMPT1 and PROMPT2

2020-02-11 Thread Michael Paquier
On Tue, Feb 11, 2020 at 10:05:25AM -0500, Robert Haas wrote:
> No objections here. I'm glad that we put in the effort to get more
> opinions, but I agree that an overall vote of ~58 to ~8 is a pretty
> strong consensus.

Clearly, so done as dcdbb5a.
--
Michael


signature.asc
Description: PGP signature


RE: [Proposal] Add accumulated statistics for wait event

2020-02-11 Thread imai.yoshik...@fujitsu.com
On Sat, Feb 1, 2020 at 5:50 AM, Pavel Stehule wrote:
> today I run 120 5minutes pgbench tests to measure impact of this patch. 
> Result is attached.
...
> Thanks to Tomas Vondra and 2ndq for hw for testing

Thank you for doing a lot of these benchmarks!

> The result is interesting - when I run pgbench in R/W mode, then I got +/- 1% 
> changes in performance. Isn't important if
> tracking time is active or not (tested on Linux). In this mode the new code 
> is not on critical path.

It seems the performance difference is big in the case of read-only
tests. The reason is that the write time is relatively longer than the
processing time of the logic I added in the patch.

> Looks so for higher scale than 5 and higher number of users 50 the results 
> are not too much stable (for read only tests - I
> repeated tests) and there overhead is about 10% from 60K tps to 55Ktps - 
> maybe I hit a hw limits (it running with 4CPU)

Yes, I suspect some other bottleneck may have been hit, which makes the
results unstable. However, it would be better to investigate what
actually happened and why performance increased/decreased by over 10%.
I will inspect it.


Also I attach v5 patches which corresponds to other committed patches.

--
Yoshikazu Imai


0001-Add-pg_stat_waitaccum-view-v5.patch
Description: 0001-Add-pg_stat_waitaccum-view-v5.patch


0002-POC-Change-measuring-method-of-wait-event-time-fr-v5.patch
Description:  0002-POC-Change-measuring-method-of-wait-event-time-fr-v5.patch


Re: [PATCH] Replica sends an incorrect epoch in its hot standby feedback to the Master

2020-02-11 Thread Thomas Munro
On Fri, Feb 7, 2020 at 1:03 PM Palamadai, Eka  wrote:
> The below problem occurs in Postgres versions 11, 10, and 9.6. However, it
> no longer occurs as of Postgres 12, because the commit [6] to add basic
> infrastructure for 64-bit transaction IDs indirectly fixed it.

I'm happy that that stuff is already fixing bugs we didn't know we
had, but, yeah, it looks like it really only fixed it incidentally by
moving all the duplicated "assign if higher" code into a function, not
through the magical power of 64 bit xids.

> The replica sends an incorrect epoch in its hot standby feedback to the 
> master in the scenario outlined below, where a checkpoint is interleaved with 
> the execution of 2 transactions at the master. The incorrect epoch in the 
> feedback causes the master to ignore the “oldest Xmin” X sent by the replica. 
> If a heap page prune[1] or vacuum were executed at the master immediately 
> thereafter, they may use a newer “oldest Xmin” Y > X,  and prematurely delete 
> a tuple T such that X < t_xmax (T) < Y, which is still in use at the replica 
> as part of a long running read query Q. Subsequently, when the replica 
> replays the deletion of T as part of its WAL replay, it cancels the long 
> running query Q causing unnecessary pain to customers.

Ouch.  Thanks for this analysis!

> The variable “ShmemVariableCache->nextXid” (or “nextXid” for short) should be 
> monotonically increasing unless it wraps around to the next epoch. However, 
> in the above sequence, this property is violated on the replica in the 
> function “RecordKnownAssignedTransactionIds”[3], when the WAL replay for the 
> insertion at step 6 is executed at the replica.

I haven't tried your repro or studied this closely yet, but yes, that
assignment to nextXid does indeed look pretty fishy.  Other similar
code elsewhere always does a check like in your patch, before
clobbering nextXid.




Re: Add A Glossary

2020-02-11 Thread Corey Huinker
>
> It seems like this could be a good idea, still the patch has been
> waiting on its author for more than two weeks now, so I have marked it
> as returned with feedback.
>

In light of feedback, I enlisted the help of an actual technical writer
(Roger Harkavy, CCed) and we eventually found the time to take a second
pass at this.

Attached is a revised patch.
From f087e44fe4db7996880cf4df982297018d444363 Mon Sep 17 00:00:00 2001
From: Corey Huinker 
Date: Wed, 12 Feb 2020 04:17:59 +
Subject: [PATCH] add glossary page with initial definitions

---
 doc/src/sgml/filelist.sgml |   1 +
 doc/src/sgml/glossary.sgml | 540 +
 doc/src/sgml/postgres.sgml |   1 +
 3 files changed, 542 insertions(+)
 create mode 100644 doc/src/sgml/glossary.sgml

diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 3da2365ea9..504c8a6326 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -170,6 +170,7 @@
 
 
 
+
 
 
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
new file mode 100644
index 00..1b881690fa
--- /dev/null
+++ b/doc/src/sgml/glossary.sgml
@@ -0,0 +1,540 @@
+
+
+
+ Glossary
+ 
+  This is a list of terms and their definitions in the context of PostgreSQL and databases in general.
+ 
+  
+   
+Aggregate
+
+ 
+  The act of combining a defined collection of data values into a single value that may not be the same type as the original values. Aggregate functions are most often used with Grouping operations which define the separate sets of data by the common values shared within those sets.
+ 
+
+   
+
+   
+Analytic
+
+ 
+  A function whose computed value can reference values found in nearby rows of the same result set.
+ 
+
+   
+
+   
+Atomic
+
+ 
+  When referring to the value of an attribute or datum: cannot be broken up into smaller components.
+ 
+ 
+  When referring to an operation: An event that cannot be partially completed; it must either completely succeed or completely fail. A series of SQL statements can be combined into a transaction, and that transaction is described as atomic.
+ 
+
+   
+
+   
+Attribute
+
+ 
+  A typed data element found within a tuple or relation or table.
+ 
+
+   
+
+   
+BYTEA
+
+ 
+  A data type for storing binary data. It is roughly analogous to the BLOB data type in other database products.
+ 
+
+   
+
+   
+Cast
+
+ 
+  The act of converting of a datum from its current data type to another data type.
+ 
+
+   
+
+   
+Check Constraint
+
+ 
+  A type of constraint defined for a relation which restricts the values allowed in one or more attributes. The check constraint can make reference to any attribute in the relation, but cannot reference other rows of the same relation or other relations.
+ 
+
+   
+
+   
+Column
+
+ 
+  An attribute found in a table or view.
+ 
+
+   
+
+   
+Commit
+
+ 
+  The act of finalizing a transaction within the database.
+ 
+
+   
+
+   
+Concurrency
+
+ 
+  The concept that multiple independent operations can be happening within the database at the same time.
+ 
+
+   
+
+   
+Constraint
+
+ 
+  A method of restricting the values of data allowed within a relation. Constraints can currently be of the following types: Check Constraint, Unique Constraint, and Exclusion Constraint.
+ 
+
+   
+
+   
+Datum
+
+ 
+  The internal representation of a SQL datatype.
+ 
+
+   
+
+   
+Delete
+
+ 
+  A SQL command that removes rows from a given table or relation.
+ 
+
+   
+
+   
+Exclusion Constraint
+
+ 
+  Exclusion constraints define both a set of columns for matching rows, and rules where values in one row would conflict with values in another.
+ 
+
+   
+
+   
+Foreign Data Wrapper
+
+ 
+  A means of representing data outside the local database so that it appears as if it were in local tables. With a Foreign Data Wrapper it is possible to define a Foreign Server and Foreign Tables.
+ 
+
+   
+
+   
+Foreign Key
+
+ 
+  A type of constraint defined on one or more columns in a table which requires the value in those columns to uniquely identify a row in the specified table.
+ 
+
+   
+
+   
+Foreign Server
+
+ 
+  A named collection of Foreign Tables which all use the same Foreign Data Wrapper and have other configured attributes in common.
+ 
+
+   
+
+   
+Foreign Table
+
+ 
+  A relation which appears to have rows and columns like a regular table, but when queried will instead forward the request for data through its Foreign Data Wrapper, which will return results structured according to the definition of the Foreign 

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-11 Thread Masahiko Sawada
On Wed, 12 Feb 2020 at 00:43, Tom Lane  wrote:
>
> I took a brief look through this patch.  I agree with the fundamental
> idea that we shouldn't need to use the heavyweight lock manager for
> relation extension, since deadlock is not a concern and no backend
> should ever need to hold more than one such lock at once.  But it feels
> to me like this particular solution is rather seriously overengineered.
> I would like to suggest that we do something similar to Robert Haas'
> excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,
> that is,
>
> * Create some predetermined number N of LWLocks for relation extension.

My original proposal used LWLocks and hash tables for relation
extension, but there was a discussion that using LWLocks is not good
because they are not interruptible[1]. For that reason, and because we
don't need two lock levels (shared, exclusive) for relation extension
locks, we ended up implementing a dedicated lock manager for extension
locks. I think we will have that interruptibility problem again if we
use LWLocks.

Regards,

[1] 
https://www.postgresql.org/message-id/CA%2BTgmoZnWYQvmeqeGyY%2B0j-Tfmx8cTzRadfxJQwK9A-nCQ7GkA%40mail.gmail.com



--
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Adding a test for speculative insert abort case

2020-02-11 Thread Melanie Plageman
On Tue, Feb 11, 2020 at 4:45 PM Andres Freund  wrote:

>
> I additionally restricted the controller_print_speculative_locks step to
> the current database and made a bunch of debug output more
> precise. Survived ~150 runs locally.
>
> Lets see what the buildfarm says...
>
>
Thanks so much for finishing the patch and checking for race
conditions!

-- 
Melanie Plageman


RE: POC: GUC option for skipping shared buffers in core dumps

2020-02-11 Thread tsunakawa.ta...@fujitsu.com
From: Craig Ringer 
> Currently my options are "dump all shmem including shared_buffers" or
> "dump no shmem". But I usually want "dump all shmem except
> shared_buffers". It's tolerable to just dump s_b on a test system with
> a small s_b, but if enabling coredumps to track down some
> hard-to-repro crash on a production system I really don't want 20GB
> coredumps...

We have a simple implementation that allows excluding shared memory.
It has been working for years.

[postgresql.conf]
core_directory = 'location of core dumps'
core_contents = '{none | minimum | full}'
# none = doesn't dump core, minimum excludes shared memory, and full dumps all

I can provide it.  But it simply changes the current directory and detaches 
shared memory when postgres receives signals that dump core.

I made this GUC because Windows also had to be dealt with.


From: Andres Freund 
> > Hah.  This argument boils down to saying our packagers suck :-)
> 
> Hm? I'd say it's a sign of respect to not have each of them do the same
> work. Especially when they can't address it to the same degree core PG
> can. So maybe I'm saying we shouldn't be lazy ;)

Maybe we should add options to pg_ctl, just like the -c option that is
available now, so that OS packagers can easily use them in their start
scripts.  Or can they just use pg_ctl's -o to specify new GUC
parameters?


Regards
Takayuki Tsunakawa



Re: POC: rational number type (fractions)

2020-02-11 Thread Jeff Davis
On Fri, 2020-02-07 at 22:25 -0600, Joe Nelson wrote:
> Hi hackers, attached is a proof of concept patch adding a new base
> type
> called "rational" to represent fractions.

Hi!

> The primary motivation was as a column type to support user-defined
> ordering of rows (with the ability to dynamically rearrange rows).
> The
> postgres wiki has a page [0] about this using pairs of integers to
> represent fractions, but it's not particularly elegant.

Sounds good.

> I wrote about options for implementing user-defined order in an
> article
> [1] and ended up creating a postgres extension, pg_rational [2], to
> provide the new type. People have been using the extension, but told
> me
> they wished they could use it on hosted platforms like Amazon RDS
> which
> have a limited set of whitelisted extensions. Thus I'm submitting
> this
> patch to discuss getting the feature in core postgres.

The decision between an extension and a core type is a tricky one. To
put an extension in core, usually it's good to show either that it
satisfies something in the SQL standard, or that there is some specific
technical advantage (like integration with the syntax or the type
system).

Integrating it in core just to make it easier to use is a double-edged
sword. It does make it easier in some environments; but it also removes
pressure to make those environments offer better support for the
extension ecosystem, ultimately weakening extensions overall.

In this case I believe it could be a candidate for in-core, but it's
borderline. The reasons it makes sense to me are:

1. It seems that there's "only one way to do it". It would be good to
validate that this really covers most of the use cases of rational
numbers, but if so, that makes it a better candidate for building it
into core. It would also be good to compare against other
implementations (perhaps in normal programming languages) to see if
there is anything interesting.

2. I don't expect this will be much of a maintenance burden.

Keep in mind that if you do want this to be in core, the data format
has to be very stable to maintain pg_upgrade compatibility.


Patch comments:

* Please include docs.

* I'm worried about the use of int32. Does that cover all of the
reasonable use cases of rational?

* Shouldn't:

/*
 * x = coalesce(lo, arg[0]) y = coalesce(hi, arg[1])
 */

  be: 

/*
 * x = coalesce(arg[0], lo) y = coalesce(arg[1], hi)
 */

* More generally, what's the philosophy regarding NULL and rational?
Why are NULL arguments returning non-NULL answers?

* Is rational_intermediate() well-defined, or can it just choose any
rational between the two arguments?

* Can you discuss how cross-type comparisons and conversions should be
handled (e.g. int8, numeric, float8)?

Regards,
Jeff Davis






Re: Adding a test for speculative insert abort case

2020-02-11 Thread Andres Freund
Hi,

On 2020-02-07 16:40:46 -0800, Andres Freund wrote:
> I'm currently fighting with a race I'm observing in about 1/4 of the
> runs. [...]
> I think the issue here is that what determines whether s1 can finish its
> check_exclusion_or_unique_constraint() check with one retry is whether
> it does the tuple visibility test before s2's transaction has
> actually marked itself as visible (note that ProcArrayEndTransaction is
> after RecordTransactionCommit logging the COMMIT above).
> 
> I think the fix is quite easy: Ensure that there *always* will be the
> second wait iteration on the transaction (in addition to the already
> always existing wait on the speculative token). Which is just adding
> s2_begin s2_commit steps.   Simple, but took me a few hours to
> understand :/.
> 
> I've attached that portion of my changes. Will interrupt scheduled
> programming for a bit of exercise now.

I've pushed this now. Thanks for the patch, and the review!

I additionally restricted the controller_print_speculative_locks step to
the current database and made a bunch of debug output more
precise. Survived ~150 runs locally.

Lets see what the buildfarm says...

Regards,

Andres




Re: Error on failed COMMIT

2020-02-11 Thread Dave Cramer
On Tue, 11 Feb 2020 at 17:35, Tom Lane  wrote:

> Vik Fearing  writes:
> > There is a current discussion off-list about what should happen when a
> > COMMIT is issued for a transaction that cannot be committed for whatever
> > reason.  PostgreSQL returns ROLLBACK as command tag but otherwise
> succeeds.
>
> > It seems like [ trying to commit a failed transaction ]
> > should actually produce something like this:
>
> > postgres=!# commit;
> > ERROR:  40P00: transaction cannot be committed
> > DETAIL:  First error was "42601: syntax error at or near "error""
>
> So I assume you're imagining that that would leave us still in
> transaction-aborted state, and the session is basically dead in
> the water until the user thinks to issue ROLLBACK instead?
>
> > Is this reading correct?
>
> Probably it is, according to the letter of the SQL spec, but I'm
> afraid that changing this behavior now would provoke lots of hate
> and few compliments.  An application that's doing the spec-compliant
> thing and issuing ROLLBACK isn't going to be affected, but apps that
> are relying on the existing behavior are going to be badly broken.
>
> A related problem is what happens if you're in a perfectly-fine
> transaction and the commit itself fails, e.g.,
>
> regression=# create table tt (f1 int primary key deferrable initially
> deferred);
> CREATE TABLE
> regression=# begin;
> BEGIN
> regression=# insert into tt values (1);
> INSERT 0 1
> regression=# insert into tt values (1);
> INSERT 0 1
> regression=# commit;
> ERROR:  duplicate key value violates unique constraint "tt_pkey"
> DETAIL:  Key (f1)=(1) already exists.
>
> At this point PG considers that you're out of the transaction:
>
> regression=# rollback;
> WARNING:  there is no transaction in progress
> ROLLBACK
>

Interesting: if you do a COMMIT after violating a NOT NULL constraint, it
simply does a ROLLBACK with no warning whatsoever:

begin;
BEGIN
test=# insert into hasnull(i) values (null);
ERROR:  null value in column "i" violates not-null constraint
DETAIL:  Failing row contains (null).
test=# commit;
ROLLBACK


>
> but I bet the spec doesn't.  So if we change that, again we break
> applications that work today.  Meanwhile, an app that is doing it
> the spec-compliant way will issue a ROLLBACK that we consider
> useless, so currently that draws an ignorable WARNING and all is
> well.  So here also, the prospects for making more people happy
> than we make unhappy seem pretty grim.  (Maybe there's a case
> for downgrading the WARNING to NOTICE, though?)

Actually, the bug reporter was looking for an upgrade from a warning to an
ERROR.

I realize we are unlikely to change the behaviour; however, it would be
useful if we did the same thing for all cases, and documented this
behaviour. We already have places where we document that we don't adhere
to the spec.

Dave


Re: pg_basebackup -F plain -R overwrites postgresql.auto.conf

2020-02-11 Thread Fujii Masao




On 2020/02/11 0:28, Alvaro Herrera wrote:

On 2020-Feb-10, Alvaro Herrera wrote:


On 2020-Feb-10, Fujii Masao wrote:



Yes! Thanks for pointing out that!
So the patch needs to be applied only in master.


Yikes, thanks. Pushing in a minute.


Actually, if you want to push it, be my guest.


Yeah, I pushed the patch. Thanks!

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




RE: open-source equivalent of golden-gate

2020-02-11 Thread tsunakawa.ta...@fujitsu.com
From: Chapman Flack 
> I read the question as perhaps concerning the other direction, whether
> there might be an open source foreign data wrapper installable in Oracle
> for talking to PostgreSQL (which might, I suppose, also have a name like
> "postgres_fdw", which helps explain the number of times I've rewritten
> this sentence trying to make it unambiguous).

Oracle Database Gateway for ODBC can be used:


Oracle Database Gateway for PostgreSQL - ORACLE-HELP
http://oracle-help.com/oracle-database/oracle-database-gateway-postgresql/


Regards
Takayuki Tsunakawa



Re: Error on failed COMMIT

2020-02-11 Thread Tom Lane
Vik Fearing  writes:
> On 11/02/2020 23:35, Tom Lane wrote:
>> So I assume you're imagining that that would leave us still in
>> transaction-aborted state, and the session is basically dead in
>> the water until the user thinks to issue ROLLBACK instead?

> Actually, I was imagining that it would end the transaction as it does
> today, just with an error code.
> This is backed up by General Rule 9 which says "The current
> SQL-transaction is terminated."

Hm ... that would be sensible, but I'm not entirely convinced.  There
are several preceding rules that say that an exception condition is
raised, and normally you can stop reading at that point; nothing else
is going to happen.  If COMMIT acts specially in this respect, they
ought to say so.

In any case, while this interpretation might change the calculus a bit,
I think we still end up concluding that altering this behavior has more
downside than upside.

regards, tom lane




Re: Error on failed COMMIT

2020-02-11 Thread Vik Fearing
On 11/02/2020 23:35, Tom Lane wrote:
> Vik Fearing  writes:
>> There is a current discussion off-list about what should happen when a
>> COMMIT is issued for a transaction that cannot be committed for whatever
>> reason.  PostgreSQL returns ROLLBACK as command tag but otherwise succeeds.
> 
>> It seems like [ trying to commit a failed transaction ]
>> should actually produce something like this:
> 
>> postgres=!# commit;
>> ERROR:  40P00: transaction cannot be committed
>> DETAIL:  First error was "42601: syntax error at or near "error""
> 
> So I assume you're imagining that that would leave us still in
> transaction-aborted state, and the session is basically dead in
> the water until the user thinks to issue ROLLBACK instead?

Actually, I was imagining that it would end the transaction as it does
today, just with an error code.

This is backed up by General Rule 9 which says "The current
SQL-transaction is terminated."

>> Is this reading correct?
> 
> Probably it is, according to the letter of the SQL spec, but I'm
> afraid that changing this behavior now would provoke lots of hate
> and few compliments.  An application that's doing the spec-compliant
> thing and issuing ROLLBACK isn't going to be affected, but apps that
> are relying on the existing behavior are going to be badly broken.

I figured that was likely.  I'm hoping to at least get a documentation
patch out of this thread, though.

> A related problem is what happens if you're in a perfectly-fine
> transaction and the commit itself fails, e.g.,
> 
> regression=# create table tt (f1 int primary key deferrable initially 
> deferred);
> CREATE TABLE
> regression=# begin;
> BEGIN
> regression=# insert into tt values (1);
> INSERT 0 1
> regression=# insert into tt values (1);
> INSERT 0 1
> regression=# commit;
> ERROR:  duplicate key value violates unique constraint "tt_pkey"
> DETAIL:  Key (f1)=(1) already exists.
> 
> At this point PG considers that you're out of the transaction:
> 
> regression=# rollback;
> WARNING:  there is no transaction in progress
> ROLLBACK
> 
> but I bet the spec doesn't.  So if we change that, again we break
> applications that work today.

I would argue that this example is entirely compliant and consistent
with my original question (except that it gives a class 23 instead of a
class 40).

> Meanwhile, an app that is doing it
> the spec-compliant way will issue a ROLLBACK that we consider
> useless, so currently that draws an ignorable WARNING and all is
> well.  So here also, the prospects for making more people happy
> than we make unhappy seem pretty grim.

I'm not entirely sure what should happen with a free-range ROLLBACK. (I
*think* it says it should error with "2D000 invalid transaction
termination" but it's a little confusing to me.)

> (Maybe there's a case for downgrading the WARNING to NOTICE, though?)

Maybe.  But I think its match (a double START TRANSACTION) should remain
a warning if we do change this.

> (Don't even *think* of suggesting that having a GUC to change
> this behavior would be appropriate.  The long-ago fiasco around
> autocommit showed us the hazards of letting GUCs affect such
> fundamental behavior.)

That thought never crossed my mind.

> Speaking of autocommit, I wonder how that would interact with
> this...

I don't see how it would be any different.
-- 
Vik Fearing




Re: Error on failed COMMIT

2020-02-11 Thread Tom Lane
Vik Fearing  writes:
> There is a current discussion off-list about what should happen when a
> COMMIT is issued for a transaction that cannot be committed for whatever
> reason.  PostgreSQL returns ROLLBACK as command tag but otherwise succeeds.

> It seems like [ trying to commit a failed transaction ]
> should actually produce something like this:

> postgres=!# commit;
> ERROR:  40P00: transaction cannot be committed
> DETAIL:  First error was "42601: syntax error at or near "error""

So I assume you're imagining that that would leave us still in
transaction-aborted state, and the session is basically dead in
the water until the user thinks to issue ROLLBACK instead?

> Is this reading correct?

Probably it is, according to the letter of the SQL spec, but I'm
afraid that changing this behavior now would provoke lots of hate
and few compliments.  An application that's doing the spec-compliant
thing and issuing ROLLBACK isn't going to be affected, but apps that
are relying on the existing behavior are going to be badly broken.

A related problem is what happens if you're in a perfectly-fine
transaction and the commit itself fails, e.g.,

regression=# create table tt (f1 int primary key deferrable initially deferred);
CREATE TABLE
regression=# begin;
BEGIN
regression=# insert into tt values (1);
INSERT 0 1
regression=# insert into tt values (1);
INSERT 0 1
regression=# commit;
ERROR:  duplicate key value violates unique constraint "tt_pkey"
DETAIL:  Key (f1)=(1) already exists.

At this point PG considers that you're out of the transaction:

regression=# rollback;
WARNING:  there is no transaction in progress
ROLLBACK

but I bet the spec doesn't.  So if we change that, again we break
applications that work today.  Meanwhile, an app that is doing it
the spec-compliant way will issue a ROLLBACK that we consider
useless, so currently that draws an ignorable WARNING and all is
well.  So here also, the prospects for making more people happy
than we make unhappy seem pretty grim.  (Maybe there's a case
for downgrading the WARNING to NOTICE, though?)

(Don't even *think* of suggesting that having a GUC to change
this behavior would be appropriate.  The long-ago fiasco around
autocommit showed us the hazards of letting GUCs affect such
fundamental behavior.)

Speaking of autocommit, I wonder how that would interact with
this...

regards, tom lane




Re: Memory-comparable Serialization of Data Types

2020-02-11 Thread Peter Geoghegan
On Tue, Feb 11, 2020 at 1:40 PM Alvaro Herrera  wrote:
> I think adding that would be too much of a burden, both for the project
> itself as for third-party type definitions; I think we'd rather rely on
> calling the BTORDER_PROC btree support function for the type.

An operator class would still need to provide a BTORDER_PROC. What I
describe would be an optional capability. This is something that I
have referred to as key normalization in the past:

https://wiki.postgresql.org/wiki/Key_normalization

I think that it would only make sense as an enabler of multiple
optimizations -- not just the memcmp()/strcmp() thing. A common
strcmp()'able binary string format can be used in many different ways.
Note that this has nothing to do with changing the representation used
by the vast majority of all tuples -- just the pivot tuples, which are
mostly located in internal pages. They only make up less than 1% of
all pages in almost all cases.

I intend to prototype this technique within the next year. It's
possible that it isn't worth the trouble, but there is only one way to
find out. I might just work on the "abbreviated keys in internal
pages" thing, for example. Though you really need some kind of prefix
compression to make that effective.

-- 
Peter Geoghegan




Re: Implementing Incremental View Maintenance

2020-02-11 Thread legrand legrand
Takuma Hoshiai wrote
> Hi, 
> 
> Attached is the latest patch (v12) to add support for Incremental
> Materialized View Maintenance (IVM).
> It is possible to apply to current latest master branch.
> 
> Differences from the previous patch (v11) include:
> * support executing REFRESH MATERIALIZED VIEW command with IVM.
> * support unscannable state by WITH NO DATA option.
> * add a check for LIMIT/OFFSET at creating an IMMV
> 
> If REFRESH is executed for an IMMV (incrementally maintainable
> materialized view), its contents are re-calculated the same way as for
> usual materialized views (full REFRESH). Although an IMMV basically keeps
> its data up to date, rounding errors can accumulate in aggregated values
> in some cases, for example, if the view contains sum/avg on float type
> columns. Running the REFRESH command on an IMMV will resolve this. Also,
> the WITH NO DATA option allows making the IMMV unscannable. At that time,
> IVM triggers are dropped from the IMMV because they become unneeded.
> 
> [...]

Hello,

regarding syntax REFRESH MATERIALIZED VIEW x WITH NO DATA

I understand that triggers are removed from the source tables, transforming 
the INCREMENTAL MATERIALIZED VIEW into a(n unscannable) MATERIALIZED VIEW.

postgres=# refresh materialized view imv with no data;
REFRESH MATERIALIZED VIEW
postgres=# select * from imv;
ERROR:  materialized view "imv" has not been populated
HINT:  Use the REFRESH MATERIALIZED VIEW command.

This operation seems to me more of an ALTER command than a REFRESH one.

Wouldn't the syntax
ALTER MATERIALIZED VIEW [ IF EXISTS ] name
SET WITH NO DATA
or
SET WITHOUT DATA
be better ?

Continuing in this direction, did you ever think about another feature
like:
ALTER MATERIALIZED VIEW [ IF EXISTS ] name
SET { NOINCREMENTAL }
or even
SET { NOINCREMENTAL | INCREMENTAL | INCREMENTAL CONCURRENTLY  }

that would permit switching between those modes and would keep frozen data
available in the materialized view during heavy operations on the source
tables?

Regards
PAscal 



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Postgres 32 bits client compilation fail. Win32 bits client is supported?

2020-02-11 Thread Ranier Vilela
On Tue, 11 Feb 2020 at 18:08, Andrew Dunstan <
andrew.duns...@2ndquadrant.com> wrote:

>
> On 2/10/20 7:13 AM, Ranier Vilela wrote:
> >
> >
> > Unfortunately, i will have to live with 32 bits clients for a long
> > time yet.
> > I still have customers using Windows XP yet ...
> >
> >
>
>
> AFAIK we don't support WinXP past Postgres Release 10 because of the
> lack of huge page support. That won't affect clients, but it does mean
> we won't build or test later releases on XP.
>
Oh yes, of course, I understand.
I support 32-bit clients that still use Windows XP, with version 9.6, for
both client and server.
Unfortunately, the notorious Windows 7 32-bit is still in use.
For these I am migrating to version 12, both for client and server.

regards,
Ranier Vilela


Error on failed COMMIT

2020-02-11 Thread Vik Fearing
There is a current discussion off-list about what should happen when a
COMMIT is issued for a transaction that cannot be committed for whatever
reason.  PostgreSQL returns ROLLBACK as command tag but otherwise succeeds.

Here is an excerpt of Section 17.7 <commit statement> that I feel is
relevant:

6) Case:

a) If any enforced constraint is not satisfied, then any changes to
SQL-data or schemas that were made by the current SQL-transaction are
canceled and an exception condition is raised: transaction rollback —
integrity constraint violation.

b) If any other error preventing commitment of the SQL-transaction has
occurred, then any changes to SQL-data or schemas that were made by the
current SQL-transaction are canceled and an exception condition is
raised: transaction rollback with an implementation-defined subclass value.

c) Otherwise, any changes to SQL-data or schemas that were made by the
current SQL-transaction are eligible to be perceived by all concurrent
and subsequent SQL-transactions.


It seems like this:

postgres=# \set VERBOSITY verbose
postgres=# begin;
BEGIN
postgres=*# error;
ERROR:  42601: syntax error at or near "error"
LINE 1: error;
^
LOCATION:  scanner_yyerror, scan.l:1150
postgres=!# commit;
ROLLBACK

should actually produce something like this:

postgres=!# commit;
ERROR:  40P00: transaction cannot be committed
DETAIL:  First error was "42601: syntax error at or near "error""

Is this reading correct?
If so, is this something we should fix?
-- 
Vik Fearing




Re: Just for fun: Postgres 20?

2020-02-11 Thread marcelo zen
I'd rather have releases made when the software is ready, not when the
calendar year mandates it.
It seems like a terrible idea.

On Tue, 11 Feb 2020 at 14:03, Andreas Joseph Krogh 
wrote:

> This project already tried that:
> https://www.postgresql.org/docs/12/history.html#HISTORY-POSTGRES95
> Didn't last long...
>
> --
> Andreas Joseph Krogh
>


Re: Memory-comparable Serialization of Data Types

2020-02-11 Thread Alvaro Herrera
On 2020-Feb-11, Peter Geoghegan wrote:

> On Tue, Feb 11, 2020 at 12:19 PM Shichao Jin  wrote:
> > Yes, this is exactly what I mean.
> 
> PostgreSQL doesn't have this capability. It might make sense to have
> it for some specific data structures,

I think adding that would be too much of a burden, both for the project
itself as for third-party type definitions; I think we'd rather rely on
calling the BTORDER_PROC btree support function for the type.

-- 
Álvaro Herrera         https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Memory-comparable Serialization of Data Types

2020-02-11 Thread Peter Geoghegan
On Tue, Feb 11, 2020 at 12:19 PM Shichao Jin  wrote:
> Yes, this is exactly what I mean.

PostgreSQL doesn't have this capability. It might make sense to have
it for some specific data structures, such as tuples on internal
B-Tree pages -- these merely guide index scans, so some information
loss there may be acceptable compared to the native/base
representation. However, that would only be faster because memcmp() is
generally faster than the underlying datatype's native comparator. Not
because comparisons have to take place in "the upper levels". There is
some indirection/overhead involved in using SQL-callable operators,
but not that much.

Note that such a representation has to lose information in at least
some cases. For example, case-insensitive collations would have to
lose information about the original case used (or store the original
alongside the conditioned binary string). Note also that a "one pass"
representation that we can just memcmp() will have to be significantly
larger in some cases, especially when collatable text is used. A
strxfrm() blob is typically about 3.3x larger than the original string
IIRC.

-- 
Peter Geoghegan




Re: Postgres 32 bits client compilation fail. Win32 bits client is supported?

2020-02-11 Thread Andrew Dunstan


On 2/10/20 7:13 AM, Ranier Vilela wrote:
>
>
> Unfortunately, i will have to live with 32 bits clients for a long
> time yet.
> I still have customers using Windows XP yet ...
>
>


AFAIK we don't support WinXP past Postgres Release 10 because of the
lack of huge page support. That won't affect clients, but it does mean
we won't build or test later releases on XP.


cheers


andrew


-- 
Andrew Dunstan         https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: Portal->commandTag as an enum

2020-02-11 Thread Mark Dilger



> On Feb 11, 2020, at 1:02 PM, Alvaro Herrera  wrote:
> 
> On 2020-Feb-11, Mark Dilger wrote:
> 
>>> No thanks.
>> 
>> I’m not sure which option you are voting for:
>> 
>> (Option #1) Have the perl script generate the .c and .h file from a .dat file
>> 
>> (Option #2) Have the perl script validate but not generate the .c and .h 
>> files
>> 
>> (Option #3) Have no perl script, with all burden on the programmer to get 
>> the .c and .h files right by hand.
>> 
>> I think you’re voting against #3, and I’m guessing you’re voting for #1, but 
>> I’m not certain.
> 
> I was voting against #2 (burden the programmer with consistency checks
> that must be fixed by hand, without actually doing the programmatically-
> doable work), but I don't like #3 either.  I do like #1.

Option #1 works for me.  If I don’t see any contrary votes before I get back to 
this patch, I’ll implement it that way for the next version.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Portal->commandTag as an enum

2020-02-11 Thread Alvaro Herrera
On 2020-Feb-11, Mark Dilger wrote:

> > No thanks.
> 
> I’m not sure which option you are voting for:
> 
> (Option #1) Have the perl script generate the .c and .h file from a .dat file
> 
> (Option #2) Have the perl script validate but not generate the .c and .h files
> 
> (Option #3) Have no perl script, with all burden on the programmer to get the 
> .c and .h files right by hand.
> 
> I think you’re voting against #3, and I’m guessing you’re voting for #1, but 
> I’m not certain.

I was voting against #2 (burden the programmer with consistency checks
that must be fixed by hand, without actually doing the programmatically-
doable work), but I don't like #3 either.  I do like #1.

-- 
Álvaro Herrera         https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Portal->commandTag as an enum

2020-02-11 Thread Mark Dilger



> On Feb 11, 2020, at 12:50 PM, Alvaro Herrera  wrote:
> 
> On 2020-Feb-11, Mark Dilger wrote:
> 
>> I thought about generating the files rather than merely checking them.
>> I could see arguments both ways.  I wasn’t sure whether there would be
>> broad support for having yet another perl script generating source
>> files, nor for the maintenance burden of having to do that on all
>> platforms.  Having a perl script that merely sanity checks the source
>> files has the advantage that there is no requirement for it to
>> function on all platforms.  There’s not even a requirement for it to
>> be committed to the tree, since you could also argue that the
>> maintenance burden of the script outweighs the burden of getting the
>> source files right by hand.
> 
> No thanks.

I’m not sure which option you are voting for:

(Option #1) Have the perl script generate the .c and .h file from a .dat file

(Option #2) Have the perl script validate but not generate the .c and .h files

(Option #3) Have no perl script, with all burden on the programmer to get the 
.c and .h files right by hand.

I think you’re voting against #3, and I’m guessing you’re voting for #1, but 
I’m not certain.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Portal->commandTag as an enum

2020-02-11 Thread Alvaro Herrera
On 2020-Feb-11, Mark Dilger wrote:

> I thought about generating the files rather than merely checking them.
> I could see arguments both ways.  I wasn’t sure whether there would be
> broad support for having yet another perl script generating source
> files, nor for the maintenance burden of having to do that on all
> platforms.  Having a perl script that merely sanity checks the source
> files has the advantage that there is no requirement for it to
> function on all platforms.  There’s not even a requirement for it to
> be committed to the tree, since you could also argue that the
> maintenance burden of the script outweighs the burden of getting the
> source files right by hand.

No thanks.

-- 
Álvaro Herrera         https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pg_locks display of speculative locks is bogus

2020-02-11 Thread Andres Freund
Hi,

On 2020-02-11 12:24:50 -0800, Peter Geoghegan wrote:
> On Tue, Feb 11, 2020 at 12:03 PM Andres Freund  wrote:
> > Doesn't seem great.
> >
> > It's trivial to put the xid in the correct place, but it's less obvious
> > what to do with the token? For master we should probably add a column,
> > but what about the back branches? Ignore it? Put it in classid or such?
> 
> My vote goes to doing nothing about the token on the back branches.
> Just prevent bogus pg_locks output.
> 
> Nobody cares about the specifics of the token value -- though perhaps
> you foresee a need to have it for testing purposes. I suppose that
> adding a column to pg_locks on the master branch is the easy way of
> resolving the situation, even if we don't really expect anyone to use
> it.

You can't really analyze what is waiting for what without seeing it -
the prime purpose of pg_locks. So I don't agree with the sentiment that
nobody cares about the token.

Greetings,

Andres Freund




Re: Portal->commandTag as an enum

2020-02-11 Thread Mark Dilger



> On Feb 11, 2020, at 11:09 AM, Alvaro Herrera  wrote:
> 
> On 2020-Feb-07, Mark Dilger wrote:
> 
>> Andres,
>> 
>> The previous patch set seemed to cause confusion, having separated
>> changes into multiple files.  The latest patch, heavily influenced by
>> your review, is all in one file, attached.
> 
> Cool stuff.

Thanks for the review!

> I think it is a little confused about what is source and what is generated.

The perl file generates nothing.  It merely checks that the .h and .c files are 
valid and consistent with each other.

> I'm not clear why commandtag.c is a C file at all; wouldn't it be
> simpler to have it as a Data::Dumper file or some sort of Perl struct,
> so that it can be read easily by the Perl file?  Similar to the
> src/include/catalog/pg_*.dat files.  That script can then generate all
> the needed .c and .h files, which are not going to be part of the source
> tree, and will always be in-sync and won't have the formatting
> strictness about it.  And you won't have the Martian syntax you had to
> use in the commandtag.c file.

I thought about generating the files rather than merely checking them.  I could 
see arguments both ways.  I wasn’t sure whether there would be broad support 
for having yet another perl script generating source files, nor for the 
maintenance burden of having to do that on all platforms.  Having a perl script 
that merely sanity checks the source files has the advantage that there is no 
requirement for it to function on all platforms.  There’s not even a 
requirement for it to be committed to the tree, since you could also argue that 
the maintenance burden of the script outweighs the burden of getting the source 
files right by hand.

> As for API, please don't shorten things such as SetQC, just use
> SetQueryCompletion.  Perhaps call it SetCompletionTag or SetCommandTag?.
> (I'm not sure about the name "QueryCompletionData"; maybe CommandTag or
> QueryTag would work better for that struct.

I am happy to rename it as SetQueryCompletion.

> There seems to be a lot of
> effort in separating QueryCompletion from CommandTag; is that really
> necessary?)

For some code paths, prior to this patch, the commandTag gets changed before 
returning, and I’m not just talking about the change where the rowcount gets 
written into the commandTag string.  I have a work-in-progress patch to provide 
system views to track the number of commands executed of each type, and that 
patch includes the ability to distinguish between what the command started as 
and what it completed as, so I do want to keep those concepts separate.  I 
rejected the “SetCommandTag” naming suggestion above because we’re really 
setting information about the completion (what it finished as) and not the 
command (what it started as).  I rejected the “SetCompletionTag” naming because 
it’s not just the tag that is being set, but both the tag and the row count.  I 
am happy with “SetQueryCompletion” because that naming is consistent with 
setting the pair of values.

> Lastly, we have a convention that we have a struct called
> FooData, and a typedef FooData *Foo, then use the latter in the API.
> We don't adhere to that 100%, and some people dislike it, but I'd rather
> be consistent and not be passing "FooData *" around; it's just noisier.

I’m familiar with the convention, and don’t like it, so I’ll have to look at a 
better way of naming this.  I specifically don’t like it because it makes a 
mess of using the const qualifier.

Once again, thanks for the review!  I will work to get another version of this 
patch posted around the time I post (separately) the command stats patch.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg_locks display of speculative locks is bogus

2020-02-11 Thread Peter Geoghegan
On Tue, Feb 11, 2020 at 12:03 PM Andres Freund  wrote:
> Doesn't seem great.
>
> It's trivial to put the xid in the correct place, but it's less obvious
> what to do with the token? For master we should probably add a column,
> but what about the back branches? Ignore it? Put it in classid or such?

My vote goes to doing nothing about the token on the back branches.
Just prevent bogus pg_locks output.

Nobody cares about the specifics of the token value -- though perhaps
you foresee a need to have it for testing purposes. I suppose that
adding a column to pg_locks on the master branch is the easy way of
resolving the situation, even if we don't really expect anyone to use
it.

-- 
Peter Geoghegan




Re: Memory-comparable Serialization of Data Types

2020-02-11 Thread Shichao Jin
Yes, this is exactly what I mean.

On Tue, 11 Feb 2020 at 15:01, Peter Geoghegan  wrote:

> On Tue, Feb 11, 2020 at 11:53 AM Shichao Jin  wrote:
> > We are currently integrating LSM-tree based storage engine RocksDB into
> Postgres. I am wondering is there any function that serialize data types in
> memory-comparable format, similar to MySQL and MariaDB. With that kind of
> function, we can directly store the serialized format in the storage engine
> and compare them in the storage engine level instead of deserializing data
> and comparing in the upper level.
>
> Do you mean a format that can perform Index comparisons using a
> memcmp() rather than per-datatype comparison code?
>
>
>
> --
> Peter Geoghegan
>


pg_locks display of speculative locks is bogus

2020-02-11 Thread Andres Freund
Hi,

While tightening up some races for [1], I noticed that the way
speculative locks are displayed in pg_locks is completely bogus. As
pg_locks has no branch specific to speculative locks, the generic object
path is used:
		case LOCKTAG_OBJECT:
		case LOCKTAG_USERLOCK:
		case LOCKTAG_ADVISORY:
		default:			/* treat unknown locktags like OBJECT */
			values[1] = ObjectIdGetDatum(instance->locktag.locktag_field1);
			values[7] = ObjectIdGetDatum(instance->locktag.locktag_field2);
			values[8] = ObjectIdGetDatum(instance->locktag.locktag_field3);
			values[9] = Int16GetDatum(instance->locktag.locktag_field4);
			nulls[2] = true;
			nulls[3] = true;
			nulls[4] = true;
			nulls[5] = true;
			nulls[6] = true;
			break;

but speculative locks are defined like:

/*
 * ID info for a speculative insert is TRANSACTION info +
 * its speculative insert counter.
 */
#define SET_LOCKTAG_SPECULATIVE_INSERTION(locktag,xid,token) \
	((locktag).locktag_field1 = (xid), \
	 (locktag).locktag_field2 = (token), \
	 (locktag).locktag_field3 = 0, \
	 (locktag).locktag_field4 = 0, \
	 (locktag).locktag_type = LOCKTAG_SPECULATIVE_TOKEN, \
	 (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)

which means that currently a speculative lock's xid is displayed as the
database, the token as the classid, and that objid and objsubid are 0
instead of NULL.

Doesn't seem great.

It's trivial to put the xid in the correct place, but it's less obvious
what to do with the token? For master we should probably add a column,
but what about the back branches? Ignore it? Put it in classid or such?

Greetings,

Andres Freund

[1] 
https://www.postgresql.org/message-id/CAAKRu_ZRmxy_OEryfY3G8Zp01ouhgw59_-_Cm8n7LzRH5BAvng%40mail.gmail.com




Re: Memory-comparable Serialization of Data Types

2020-02-11 Thread Peter Geoghegan
On Tue, Feb 11, 2020 at 11:53 AM Shichao Jin  wrote:
> We are currently integrating the LSM-tree based storage engine RocksDB into 
> Postgres. I am wondering whether there is any function that serializes data types in 
> a memory-comparable format, similar to MySQL and MariaDB. With that kind of 
> function, we can directly store the serialized format in the storage engine 
> and compare them in the storage engine level instead of deserializing data 
> and comparing in the upper level.

Do you mean a format that can perform Index comparisons using a
memcmp() rather than per-datatype comparison code?



-- 
Peter Geoghegan




Memory-comparable Serialization of Data Types

2020-02-11 Thread Shichao Jin
Hi Postgres Developers,

We are currently integrating the LSM-tree based storage engine RocksDB into
Postgres. I am wondering whether there is any function that serializes data types in
a memory-comparable format, similar to MySQL and MariaDB. With that kind of
function, we can directly store the serialized format in the storage engine
and compare them in the storage engine level instead of deserializing data
and comparing in the upper level. I know PostgreSQL is moving towards supporting
pluggable storage engines, so I think this feature would be particularly useful.

Best,
Shichao
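For readers wondering what such a format looks like: a memcmp-order-preserving encoding can be sketched in a few lines. This is a standalone illustration of the usual big-endian/sign-flip scheme used by MyRocks-style key formats, not PostgreSQL or RocksDB code:

```python
import struct

def key_int32(v: int) -> bytes:
    # Flip the sign bit and emit big-endian, so that unsigned byte-wise
    # comparison (memcmp) orders the encodings the same as the integers.
    return struct.pack(">I", (v + 2**31) & 0xFFFFFFFF)

def key_text(s: str) -> bytes:
    # Terminate the string so that, inside a composite key, a shorter
    # string sorts before any longer string sharing its prefix.
    # (Embedded NUL bytes would need escaping; glossed over here.)
    return s.encode("utf-8") + b"\x00"

def composite_key(a: int, s: str) -> bytes:
    # Concatenation preserves order column by column.
    return key_int32(a) + key_text(s)

vals = [-2**31, -45, -1, 0, 1, 42, 2**31 - 1]
assert [key_int32(v) for v in sorted(vals)] == sorted(key_int32(v) for v in vals)
```

With keys encoded this way, a storage engine can compare index entries with a plain memcmp instead of calling back into per-datatype comparison code.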


Re: Portal->commandTag as an enum

2020-02-11 Thread Alvaro Herrera
On 2020-Feb-07, Mark Dilger wrote:

> Andres,
> 
> The previous patch set seemed to cause confusion, having separated
> changes into multiple files.  The latest patch, heavily influenced by
> your review, is all in one file, attached.

Cool stuff.

I think it is a little confused about what is source and what is generated.
I'm not clear why commandtag.c is a C file at all; wouldn't it be
simpler to have it as a Data::Dumper file or some sort of Perl struct,
so that it can be read easily by the Perl file?  Similar to the
src/include/catalog/pg_*.dat files.  That script can then generate all
the needed .c and .h files, which are not going to be part of the source
tree, and will always be in-sync and won't have the formatting
strictness about it.  And you won't have the Martian syntax you had to
use in the commandtag.c file.

As for API, please don't shorten things such as SetQC, just use
SetQueryCompletion.  Perhaps call it SetCompletionTag or SetCommandTag?.
(I'm not sure about the name "QueryCompletionData"; maybe CommandTag or
QueryTag would work better for that struct.  There seems to be a lot of
effort in separating QueryCompletion from CommandTag; is that really
necessary?)  Lastly, we have a convention that we have a struct called
FooData, and a typedef FooData *Foo, then use the latter in the API.
We don't adhere to that 100%, and some people dislike it, but I'd rather
be consistent and not be passing "FooData *" around; it's just noisier.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Fetching timeline during recovery

2020-02-11 Thread Jehan-Guillaume de Rorthais
On Fri, 31 Jan 2020 15:12:30 +0900
Michael Paquier  wrote:

> On Thu, Jan 23, 2020 at 05:54:08PM +0100, Jehan-Guillaume de Rorthais wrote:
> > Please find the new version of the patch in attachment.
> 
> To be honest, I find the concept of this patch confusing.
> pg_stat_wal_receiver is just a one-one mapping with the shared memory
> state of the WAL receiver itself and show data *if and only if* a WAL
> receiver is running and iff it is ready to display any data, so I'd
> rather not change its nature

If you are talking about the pg_stat_wal_receiver view, I don't have a strong
opinion on this anyway, as I voted 0 when it was discussed. My current patch
doesn't alter its nature.

> and it has nothing to do with the state of WAL being applied by the startup
> process.

Indeed, I felt that adding these columns was a bad design, as stated in
my last mail. So I withdraw this.

> So this gets a -1 from me.

OK.

[...]
> Isn't what you are looking for here a different system view which maps
> directly to XLogCtl so as you can retrieve the status of the applied
> WAL at recovery anytime

My main objective is the received LSN/TLI. This is kept by WalRcv for streaming.
That's why pg_stat_wal_receiver was a very good place for my need. But again,
you are right, I shouldn't have added the replayed bits to it.

> say pg_stat_recovery?

I finally dug into this path. I was hoping we could find something
simpler and lighter, but the other solutions we studied so far (thanks all for your
time) were all discarded [1].

A new pg_stat_get_recovery() view might be useful for various monitoring
purpose. After poking around in the code, it seems the patch would be bigger
than previous solutions, so I prefer discussing the specs first. 

At a first glance, I would imagine the following columns as a minimal patch:

* source: stream, archive or pg_wal
* write/flush/replayed LSN
* write/flush/replayed TLI

This already has some heavy impact on the code. The source might be taken from
xlog.c:currentSource, so it should probably be included in XLogCtl to be
accessible from any backend.

As replayed LSN/TLI comes from XLogCtl too, we might probably need a new
dedicated function to gather these fields plus currentSource under the same
info_lck.

Next, the write LSN/TLI is not accessible from WalRcv, only the flush position. So
either we do not include it, or we would probably need to replace
WalRcv->receivedUpto with the existing LogstreamResult.

Next, there are no stats about WAL shipping recovery. Restoring a WAL from the
archive does not update anything about the write/flush LSN/TLI. I wonder if both
the wal_receiver stats and the WAL shipping stats might be merged into the same
refactored structure in shmem, as they might share a fair number of fields.
This would be pretty invasive in the code, but I feel it's heavier to add
another new struct in shmem just to track WAL shipping stats whereas WalRcv
already exists there.

Now, I think the following additional fields might be useful for monitoring. But
as this is out of my original scope, I prefer discussing how useful they might
be:

* start_time: start time of the current source
* restored_count: total number of WAL segments restored. We might want to split this
  counter to track each method individually.
* last_received_time: last time we received something from the current source
* last_fail_time: last failure time, whatever the source

Thanks for reading up to here!

Regards,


[1] even if I still hope the pg_stat_get_wal_receiver might still gather some
more positive votes :)




Sv: Just for fun: Postgres 20?

2020-02-11 Thread Andreas Joseph Krogh

This project already tried that: 
https://www.postgresql.org/docs/12/history.html#HISTORY-POSTGRES95 
 
Didn't last long... 


--
 Andreas Joseph Krogh 

Re: Just for fun: Postgres 20?

2020-02-11 Thread Joshua Drake
>
>
> From: Jose Luis Tallon 
>
> >  Musing some other date-related things I stumbled upon the thought
> > that naming the upcoming release PostgreSQL 20 might be preferable to
> > the current/expected "PostgreSQL 13".
>
> +1
>
> Users can easily know how old/new the release is that they are using.
>
>
There are multiple pros and cons to this idea. There is an argument, since
we are on annual releases, that 20 makes sense, and (14) would be 21, etc.
However, there is a significant problem with that. Our annual releases are
a relatively new thing, and I can definitely see a situation in the future
where we move back from annual releases to a more conservative timeline.
Further, the jump in the number is going to be seen as a marketing ploy, and
if we are going to be doing marketing ploys, then we should have the new
feature set to back it up upon release.

JD


Re: Change atoi to strtol in same place

2020-02-11 Thread Alvaro Herrera
On 2019-Dec-06, Joe Nelson wrote:

> I see this patch has been moved to the next commitfest. What's the next
> step, does it need another review?

This patch doesn't currently apply; it has conflicts with at least
01368e5d9da7 and 7e735035f208; even in 7e735035f208^ it applies with
fuzz.  Please post an updated version so that it can move forward.

On the other hand, I doubt that patching pg_standby is productive.  I
would just leave that out entirely.  See this thread from 2014
https://postgr.es/m/545946e9.8060...@gmx.net

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-02-11 Thread Justin Pryzby
For your v7 patch, which handles REINDEX to a new tablespace, I have a few
minor comments:

+ * the relation will be rebuilt.  If InvalidOid is used, the default

=> should say "current", not "default"?

+++ b/doc/src/sgml/ref/reindex.sgml
+TABLESPACE
...
+new_tablespace

=> I saw you split the description of TABLESPACE from new_tablespace based on
comment earlier in the thread, but I suggest that the descriptions for these
should be merged, like:

+   
+TABLESPACE new_tablespace
+
+ 
+  Allow specification of a tablespace where all rebuilt indexes will be created.
+  Cannot be used with "mapped" relations. If SCHEMA,
+  DATABASE or SYSTEM are specified, then
+  all unsuitable relations will be skipped and a single WARNING
+  will be generated.
+ 
+
+   

The existing patch is very natural, especially the parts in the original patch
handling vacuum full and cluster.  Those were removed to concentrate on
REINDEX, and based on comments that it might be nice if ALTER handled CLUSTER
and VACUUM FULL.  On a separate thread, I brought up the idea of ALTER using
clustered order.  Tom pointed out some issues with my implementation, and
didn't like the idea, either.

So I suggest to re-include the CLUSTER/VAC FULL parts as a separate 0002 patch,
the same way they were originally implemented.

BTW, I think if "ALTER" were updated to support REINDEX (to allow multiple
operations at once), it might be either:
|ALTER INDEX i SET TABLESPACE , REINDEX -- to reindex a single index on a given tablespace
or
|ALTER TABLE tbl REINDEX USING INDEX TABLESPACE spc; -- to reindex all indexes on a table, with the indexes moved to a given tablespace
"USING INDEX TABLESPACE" is already used for ALTER..ADD column/table CONSTRAINT.

-- 
Justin




Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-11 Thread Ashutosh Bapat
On Tue, Feb 11, 2020 at 8:27 AM Andy Fan  wrote:

>
>
> On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat <
> ashutosh.bapat@gmail.com> wrote:
>
>>
>>
>>>
>>> [PATCH] Erase the distinctClause if the result is unique by
>>>  definition
>>>
>>
>> I forgot to mention this in the last round of comments. Your patch was
>> actually removing distictClause from the Query structure. Please avoid
>> doing that. If you remove it, you are also removing the evidence that this
>> Query had a DISTINCT clause in it.
>>
>
> Yes, I removed it because it is the easiest way to do it.  what is the
> purpose of keeping the evidence?
>

Julien's example provides an explanation for this. The Query structure is
serialised into a view definition. Removing distinctClause from there means
that the view will never try to produce unique results.

>
>


>
> Suppose after a DDL, the prepared statement need to be re-parsed/planned
> if it is not executed or it will prevent the DDL to happen.
>

The query will be replanned. I am not sure about reparsed though.


>
>
> -- session 2
> postgres=# alter table t alter column b drop not null;
> ALTER TABLE
>
> -- session 1:
> postgres=# explain execute st(1);
>  QUERY PLAN
> -
>  Unique  (cost=1.03..1.04 rows=1 width=4)
>->  Sort  (cost=1.03..1.04 rows=1 width=4)
>  Sort Key: b
>  ->  Seq Scan on t  (cost=0.00..1.02 rows=1 width=4)
>Filter: (c = $1)
> (5 rows)
>

Since this prepared statement is parameterised, PostgreSQL is replanning it
every time it gets executed. It's not using a stored prepared plan. Try
without parameters. Also make sure that a prepared plan is used for
execution and not a new plan.
--
Best Wishes,
Ashutosh Bapat


Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-11 Thread Ashutosh Bapat
On Mon, Feb 10, 2020 at 10:57 PM Tom Lane  wrote:

> Ashutosh Bapat  writes:
> >> On Sat, Feb 8, 2020 at 12:53 PM Andy Fan 
> wrote:
> >> Do you mean adding some information into PlannerInfo,  and when we
> create
> >> a node for Unique/HashAggregate/Group,  we can just create a dummy node?
>
> > Not so much as PlannerInfo but something on lines of PathKey. See PathKey
> > structure and related code. What I envision is PathKey class is also
> > annotated with the information whether that PathKey implies uniqueness.
> > E.g. a PathKey derived from a Primary index would imply uniqueness also.
> A
> > PathKey derived from say Group operation also implies uniqueness. Then
> just
> > by looking at the underlying Path we would be able to say whether we need
> > Group/Unique node on top of it or not. I think that would make it much
> > wider usecase and a very useful optimization.
>
> FWIW, that doesn't seem like a very prudent approach to me, because it
> confuses sorted-ness with unique-ness.  PathKeys are about sorting,
> but it's possible to have uniqueness guarantees without having sorted
> anything, for instance via hashed grouping.
>

> I haven't looked at this patch, but I'd expect it to use infrastructure
> related to query_is_distinct_for(), and that doesn't deal in PathKeys.
>
Thanks for the pointer. I think there's another problem with my approach.
PathKeys are specific to paths since the order of the result depends upon
the Path. But uniqueness is a property of the result i.e. relation and thus
should be attached to RelOptInfo as query_is_distinct_for() does. I think
uniqueness should bubble up the RelOptInfo tree, annotating each RelOptInfo
with the minimum set of TLEs which make the result from that relation
unique. Thus we could eliminate extra Group/Unique node if the underlying
RelOptInfo's unique column set is subset of required uniqueness.
-- 
--
Best Wishes,
Ashutosh Bapat


Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-11 Thread Tom Lane
I took a brief look through this patch.  I agree with the fundamental
idea that we shouldn't need to use the heavyweight lock manager for
relation extension, since deadlock is not a concern and no backend
should ever need to hold more than one such lock at once.  But it feels
to me like this particular solution is rather seriously overengineered.
I would like to suggest that we do something similar to Robert Haas'
excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,
that is,

* Create some predetermined number N of LWLocks for relation extension.
* When we want to extend some relation R, choose one of those locks
  (say, R's relfilenode number mod N) and lock it.

1. As long as all backends agree on the relation-to-lock mapping, this
provides full security against concurrent extensions of the same
relation.

2. Occasionally a backend will be blocked when it doesn't need to be,
because of false sharing of a lock between two relations that need to
be extended at the same time.  But as long as N is large enough (and
I doubt that it needs to be very large), that will be a negligible
penalty.

3. Aside from being a lot simpler than the proposed extension_lock.c,
this approach involves absolutely negligible overhead beyond the raw
LWLockAcquire and LWLockRelease calls.  I suspect therefore that in
typical noncontended cases it will be faster.  It also does not require
any new resource management overhead, thus eliminating this patch's
small but real penalty on transaction exit/cleanup.

We'd need to do a bit of performance testing to choose a good value
for N.  I think that with N comparable to MaxBackends, the odds of
false sharing being a problem would be quite negligible ... but it
could be that we could get away with a lot less than that.

regards, tom lane




Re: Add %x to PROMPT1 and PROMPT2

2020-02-11 Thread Robert Haas
On Sun, Feb 9, 2020 at 7:45 PM Michael Paquier  wrote:
> On Mon, Feb 10, 2020 at 12:16:44AM +0100, Vik Fearing wrote:
> > There is a little bit of overlap within those three groups but among the
> > minuscule percentage of our users that responded, the result is
> > overwhelmingly in favor of this change.
>
> Thanks Vik for handling that.  So, it seems to me that we have a
> conclusion here.  Any last words?

No objections here. I'm glad that we put in the effort to get more
opinions, but I agree that an overall vote of ~58 to ~8 is a pretty
strong consensus.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: open-source equivalent of golden-gate

2020-02-11 Thread Thomas Kellerer
ROS Didier schrieb am 11.02.2020 um 11:23:
> In the Oracle world we use the product "golden gate" to execute
> transactions from a source database (Oracle, Mysql) to a PostgreSQL
> instance.
>
> This allows 2 Oracle and PostgreSQL databases to be updated at the
> same time in real time.
>
> I would like to know if there is an equivalent open-source product.
>
> Thanks in advance
>
> Best Regards
> Didier ROS

The closest solutions to golden gate are probably

* https://debezium.io/
* https://www.symmetricds.org/

Thomas




Re: open-source equivalent of golden-gate

2020-02-11 Thread Chapman Flack
On 02/11/20 07:51, Victor Yegorov wrote:
> Tue, 11 Feb 2020 at 12:23, ROS Didier :
> 
>> In the Oracle world we use the product "golden gate" to execute
>> transactions from a source database (Oracle, Mysql) to a PostgreSQL
>> instance.
> 
> Note, that PostgreSQL provides only infrastructure, wrappers for different
> remote systems are not supported by the PostgreSQL community,
> except for postgres_fdw and file_fdw provided by the project.

I read the question as perhaps concerning the other direction, whether
there might be an open source foreign data wrapper installable in Oracle
for talking to PostgreSQL (which might, I suppose, also have a name like
"postgres_fdw", which helps explain the number of times I've rewritten
this sentence trying to make it unambiguous).

Regards,
-Chap




Re: open-source equivalent of golden-gate

2020-02-11 Thread Victor Yegorov
Tue, 11 Feb 2020 at 12:23, ROS Didier :

> In the Oracle world we use the product "golden gate" to execute
> transactions from a source database (Oracle, Mysql) to a PostgreSQL
> instance.
>
> This allows 2 Oracle and PostgreSQL databases to be updated at the same
> time in real time.
>
> I would like to know if there is an equivalent open-source product.
>

There is a SQL/MED standard exactly for this:
https://wiki.postgresql.org/wiki/SQL/MED

Implemented in PostgreSQL as Foreign Data Wrappers:
https://wiki.postgresql.org/wiki/Fdw
You need to do the following:
1. Add wrapper via
https://www.postgresql.org/docs/current/sql-createextension.html
2. Create remote source via
https://www.postgresql.org/docs/current/sql-createserver.html
3. Create foreign table via
https://www.postgresql.org/docs/current/sql-createforeigntable.html

Note, that PostgreSQL provides only infrastructure, wrappers for different
remote systems are not supported by the PostgreSQL community,
except for postgres_fdw and file_fdw provided by the project.


-- 
Victor Yegorov


Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-11 Thread Julien Rouhaud
On Tue, Feb 11, 2020 at 08:14:14PM +0800, Andy Fan wrote:
> On Tue, Feb 11, 2020 at 3:56 PM Julien Rouhaud  wrote:
> 
> > >
> > > and if we prepare sql outside a transaction, and execute it in the
> > > transaction, the other session can't drop the constraint until the
> > > transaction is ended.
> >
> > And what if you create a view on top of a query containing a distinct
> > clause
> > rather than using prepared statements?  FWIW your patch doesn't handle such
> > case at all, without even needing to drop constraints:
> >
> > CREATE TABLE t (a int primary key, b int not null,  c int);
> > INSERT INTO t VALUEs(1, 1, 1), (2, 2, 2);
> > CREATE UNIQUE INDEX t_idx1 on t(b);
> > CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
> > EXPLAIN SELECT * FROM v1;
> > server closed the connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request.
> >
> >
> This error can be fixed with
> 
> -   num_of_rtables = bms_num_members(non_semi_anti_relids);
> +   num_of_rtables = list_length(query->rtable);
> 
> This test case has also been added to the patch.
> 
> 
> > I also think this is not the right way to handle this optimization.
> >
> 
> do you have any other concerns?

Yes, it seems to be broken as soon as you alter the view's underlying table:

=# CREATE TABLE t (a int primary key, b int not null,  c int);
CREATE TABLE

=# INSERT INTO t VALUEs(1, 1, 1), (2, 2, 2);
INSERT 0 2

=# CREATE UNIQUE INDEX t_idx1 on t(b);
CREATE INDEX

=# CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
CREATE VIEW

=# EXPLAIN SELECT * FROM v1;
   QUERY PLAN
-
 Seq Scan on t  (cost=0.00..1.02 rows=2 width=4)
(1 row)

=# EXPLAIN SELECT DISTINCT b FROM t;
   QUERY PLAN
-
 Seq Scan on t  (cost=0.00..1.02 rows=2 width=4)
(1 row)

=# ALTER TABLE t ALTER COLUMN b DROP NOT NULL;
ALTER TABLE

=# EXPLAIN SELECT * FROM v1;
   QUERY PLAN
-
 Seq Scan on t  (cost=0.00..1.02 rows=2 width=4)
(1 row)

=# EXPLAIN SELECT DISTINCT b FROM t;
 QUERY PLAN
-
 Unique  (cost=1.03..1.04 rows=2 width=4)
   ->  Sort  (cost=1.03..1.03 rows=2 width=4)
 Sort Key: b
 ->  Seq Scan on t  (cost=0.00..1.02 rows=2 width=4)
(4 rows)





Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-11 Thread Andy Fan
On Tue, Feb 11, 2020 at 3:56 PM Julien Rouhaud  wrote:

> >
> > and if we prepare sql outside a transaction, and execute it in the
> > transaction, the other session can't drop the constraint until the
> > transaction is ended.
>
> And what if you create a view on top of a query containing a distinct
> clause
> rather than using prepared statements?  FWIW your patch doesn't handle such
> case at all, without even needing to drop constraints:
>
> CREATE TABLE t (a int primary key, b int not null,  c int);
> INSERT INTO t VALUEs(1, 1, 1), (2, 2, 2);
> CREATE UNIQUE INDEX t_idx1 on t(b);
> CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
> EXPLAIN SELECT * FROM v1;
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
>
>
This error can be fixed with

-   num_of_rtables = bms_num_members(non_semi_anti_relids);
+   num_of_rtables = list_length(query->rtable);

This test case has also been added to the patch.


> I also think this is not the right way to handle this optimization.
>

do you have any other concerns?


0001-Erase-the-distinctClause-if-the-result-is-unique-by-.patch
Description: Binary data


custom postgres launcher for tests

2020-02-11 Thread Ivan Taranov
This patch allows using a custom postgres launcher for (TAP) tests
by setting the environment variable PGLAUNCHER.

Other known methods (like https://wiki.postgresql.org/wiki/Valgrind)
require installation, build system modifications, executable replacement,
etc.

The proposed way is simpler and more flexible.


** Use-case: run checks under Valgrind


- prepare launcher

echo 'exec valgrind postgres "$@"' > /tmp/pgvalgrind
chmod +x /tmp/pgvalgrind

- execute regress tests under Valgrind

PGLAUNCHER=/tmp/pgvalgrind TESTS=gin make check-tests

- execute concrete tap-test under Valgrind

PGLAUNCHER=/tmp/pgvalgrind PROVE_TESTS=t/001_stream_rep.pl make \
check -C src/test/recovery


** Use-case: execute tests with different postgres versions


- prepare multi-launcher

cat <<EOF > /tmp/launcher
cp -f `pwd`/src/backend/postgres.v*  \`pg_config --bindir\`
exec postgres.v\$V "\$@"
EOF
chmod +x /tmp/launcher

- make some versions of postgres binary

./configure "CFLAGS=..." && make
mv src/backend/postgres src/backend/postgres.v1

./configure "CFLAGS=..." && make
mv src/backend/postgres src/backend/postgres.v2

- run checks with different postgres binaries

PGLAUNCHER=/tmp/launcher V=1  make check -C contrib/bloom
PGLAUNCHER=/tmp/launcher V=2  make check -C contrib/bloom
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 9575268bd7..6c183be3a2 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -751,6 +751,7 @@ sub start
 	my $port   = $self->port;
 	my $pgdata = $self->data_dir;
 	my $name   = $self->name;
+	my @launcher = defined $ENV{PGLAUNCHER} ? ('-p', $ENV{PGLAUNCHER}) : ();
 	my $ret;
 
 	BAIL_OUT("node \"$name\" is already running") if defined $self->{_pid};
@@ -767,7 +768,7 @@ sub start
 		# Note: We set the cluster_name here, not in postgresql.conf (in
 		# sub init) so that it does not get copied to standbys.
 		$ret = TestLib::system_log('pg_ctl', '-D', $self->data_dir, '-l',
-			$self->logfile, '-o', "--cluster-name=$name", 'start');
+			$self->logfile, '-o', "--cluster-name=$name", @launcher, 'start');
 	}
 
 	if ($ret != 0)
@@ -867,6 +868,7 @@ sub restart
 	my $pgdata  = $self->data_dir;
 	my $logfile = $self->logfile;
 	my $name= $self->name;
+	my @launcher = defined $ENV{PGLAUNCHER} ? ('-p', $ENV{PGLAUNCHER}) : ();
 
 	print "### Restarting node \"$name\"\n";
 
@@ -875,7 +877,7 @@ sub restart
 		delete $ENV{PGAPPNAME};
 
 		TestLib::system_or_bail('pg_ctl', '-D', $pgdata, '-l', $logfile,
-			'restart');
+			@launcher, 'restart');
 	}
 
 	$self->_update_pid(1);
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 92bd28dc5a..8104db30f0 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -2290,6 +2290,7 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
 		FILE	   *pg_conf;
 		const char *env_wait;
 		int			wait_seconds;
+		const char *env_launcher;
 
 		/*
 		 * Prepare the temp instance
@@ -2423,15 +2424,26 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
 		 * Start the temp postmaster
 		 */
 		header(_("starting postmaster"));
-		snprintf(buf, sizeof(buf),
- "\"%s%spostgres\" -D \"%s/data\" -F%s "
- "-c \"listen_addresses=%s\" -k \"%s\" "
- "> \"%s/log/postmaster.log\" 2>&1",
- bindir ? bindir : "",
- bindir ? "/" : "",
- temp_instance, debug ? " -d 5" : "",
- hostname ? hostname : "", sockdir ? sockdir : "",
- outputdir);
+		env_launcher = getenv("PGLAUNCHER");
+		if (env_launcher != NULL)
+			snprintf(buf, sizeof(buf),
+	"\"%s\" -D \"%s/data\" -F%s "
+	"-c \"listen_addresses=%s\" -k \"%s\" "
+	"> \"%s/log/postmaster.log\" 2>&1",
+	env_launcher,
+	temp_instance, debug ? " -d 5" : "",
+	hostname ? hostname : "", sockdir ? sockdir : "",
+	outputdir);
+		else
+			snprintf(buf, sizeof(buf),
+	"\"%s%spostgres\" -D \"%s/data\" -F%s "
+	"-c \"listen_addresses=%s\" -k \"%s\" "
+	"> \"%s/log/postmaster.log\" 2>&1",
+	bindir ? bindir : "",
+	bindir ? "/" : "",
+	temp_instance, debug ? " -d 5" : "",
+	hostname ? hostname : "", sockdir ? sockdir : "",
+	outputdir);
 		postmaster_pid = spawn_process(buf);
 		if (postmaster_pid == INVALID_PID)
 		{


open-source equivalent of golden-gate

2020-02-11 Thread ROS Didier
Hi
In the Oracle world we use the product "golden gate" to execute transactions 
from a source database (Oracle, Mysql) to a PostgreSQL instance.
This allows 2 Oracle and PostgreSQL databases to be updated at the same time in 
real time.
I would like to know if there is an equivalent open-source product.

Thanks in advance

Best Regards
Didier ROS
EDF






This message and any attachments (the 'Message') are intended solely for the 
addressees. The information contained in this Message is confidential. Any use 
of information contained in this Message not in accord with its purpose, any 
dissemination or disclosure, either whole or partial, is prohibited except 
formal approval.

If you are not the addressee, you may not copy, forward, disclose or use any 
part of it. If you have received this message in error, please delete it and 
all copies from your system and notify the sender immediately by return message.

E-mail communication cannot be guaranteed to be timely secure, error or 
virus-free.


Re: Add PostgreSQL home page to --help output

2020-02-11 Thread Daniel Gustafsson
> On 11 Feb 2020, at 08:41, Peter Eisentraut  
> wrote:

> Autoconf already has a way to register the package home page and propagate 
> it, so I used that.  That also makes it easier to change it (see http: -> 
> https:) or have third parties substitute their own contact information 
> without destroying translations.

+1, this change has the side benefit of aiding postgres forks who otherwise
have to patch all occurrences to avoid getting reports on the wrong list.

> While at it, I also did the same refactoring for the bug reporting address 
> (which was also recently changed, so this is a bit late, but who knows what 
> the future holds).

Pardon my weak autoconf skills, but what do the inverted brackets (]foo[ as
opposed to [foo]) do in the below?

-Please also contact  to see about
+Please also contact <]AC_PACKAGE_BUGREPORT[> to see about

cheers ./daniel



Re: Internal key management system

2020-02-11 Thread David Fetter
On Mon, Feb 10, 2020 at 05:57:47PM -0800, Andres Freund wrote:
> Hi,
> 
> On 2020-02-08 12:07:57 -0500, Tom Lane wrote:
> > For the same reason, I don't think that an "internal key management"
> > feature in the core code is ever going to be acceptable.  It has to
> > be an extension.  (But, as long as it's an extension, whether it's
> > bringing its own crypto or relying on some other extension for that
> > doesn't matter from the legal standpoint.)
> 
> I'm not convinced by that. We have optional in-core functionality that
> requires external libraries, and we add more cases, if necessary.

Take for example libreadline, without which our CLI is at best
dysfunctional.

> > > Sure, I know the code is currently calling ooenssl functions. I
> > > was responding to Masahiko-san's message that we might
> > > eventually merge this openssl code into our tree.
> > 
> > No.  This absolutely, positively, will not happen.  There will
> > never be crypto functions in our core tree, because then there'd
> > be no recourse for people who want to use Postgres in countries
> > with restrictions on crypto software.  It's hard enough for them
> > that we have such code in contrib --- but at least they can remove
> > pgcrypto and be legal.  If it's in src/common then they're stuck
> 
> Isn't that basically a problem of the past by now?
> 
> Partially due to changed laws (e.g. France, which used to be a
> problematic case),

It's less of a problem than it once was, but it's not exactly gone yet.
https://en.wikipedia.org/wiki/Restrictions_on_the_import_of_cryptography

> but also because it's basically futile to have
> import restrictions on cryptography by now. Just about every larger
> project contains significant amounts of cryptographic code and it's
> entirely impractical to operate anything interfacing with network
> without some form of transport encryption.  And just about all open
> source distribution mechanism have stopped separating out crypto
> code a long time ago.

That's true.  We have access to legal counsel. Maybe it's worth asking
them how best to include cryptographic functionality, "how" being the
question one asks when one wants to get a positive response.

> I however do agree that we should strive to not introduce
> cryptographic code into the pg source tree - nobody here seems to
> have even close to enough experience to maintaining / writing that.

+1 for not turning ourselves into implementers of cryptographic
primitives.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-11 Thread Andy Fan
On Tue, Feb 11, 2020 at 3:56 PM Julien Rouhaud  wrote:

> On Tue, Feb 11, 2020 at 10:57:26AM +0800, Andy Fan wrote:
> > On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat <
> > ashutosh.bapat@gmail.com> wrote:
> >
> > > I forgot to mention this in the last round of comments. Your patch was
> > > actually removing distictClause from the Query structure. Please avoid
> > > doing that. If you remove it, you are also removing the evidence that
> this
> > > Query had a DISTINCT clause in it.
> > >
> >
> > Yes, I removed it because it is the easiest way to do it.  what is the
> > purpose of keeping the evidence?
> >
> > >> However the patch as presented has some problems
> > >> 1. What happens if the primary key constraint or NOT NULL constraint
> gets
> > >> dropped between a prepare and execute? The plan will no more be valid
> and
> > >> thus execution may produce non-distinct results.
> >
> > > But that doesn't matter since a query can be prepared outside a
> > > transaction and executed within one or more subsequent transactions.
> > >
> >
> > Suppose after a DDL, the prepared statement needs to be re-parsed/planned
> > if it has not been executed, or it will prevent the DDL from happening.
> >
> > The following is my test.
> >
> > postgres=# create table t (a int primary key, b int not null,  c int);
> > CREATE TABLE
> > postgres=# insert into t values(1, 1, 1), (2, 2, 2);
> > INSERT 0 2
> > postgres=# create unique index t_idx1 on t(b);
> > CREATE INDEX
> >
> > postgres=# prepare st as select distinct b from t where c = $1;
> > PREPARE
> > postgres=# explain execute st(1);
> >QUERY PLAN
> > -
> >  Seq Scan on t  (cost=0.00..1.02 rows=1 width=4)
> >Filter: (c = 1)
> > (2 rows)
> > ...
> > postgres=# explain execute st(1);
> >QUERY PLAN
> > -
> >  Seq Scan on t  (cost=0.00..1.02 rows=1 width=4)
> >Filter: (c = $1)
> > (2 rows)
> >
> > -- session 2
> > postgres=# alter table t alter column b drop not null;
> > ALTER TABLE
> >
> > -- session 1:
> > postgres=# explain execute st(1);
> >  QUERY PLAN
> > -
> >  Unique  (cost=1.03..1.04 rows=1 width=4)
> >->  Sort  (cost=1.03..1.04 rows=1 width=4)
> >  Sort Key: b
> >  ->  Seq Scan on t  (cost=0.00..1.02 rows=1 width=4)
> >Filter: (c = $1)
> > (5 rows)
> >
> > -- session 2
> > postgres=# insert into t values (3, null, 3), (4, null, 3);
> > INSERT 0 2
> >
> > -- session 1
> > postgres=# execute st(3);
> >  b
> > ---
> >
> > (1 row)
> >
> > and if we prepare sql outside a transaction, and execute it in the
> > transaction, the other session can't drop the constraint until the
> > transaction is ended.
>
> And what if you create a view on top of a query containing a distinct
> clause
> rather than using prepared statements?  FWIW your patch doesn't handle such
> a case at all, without even needing to drop constraints:


> CREATE TABLE t (a int primary key, b int not null,  c int);
> INSERT INTO t VALUEs(1, 1, 1), (2, 2, 2);
> CREATE UNIQUE INDEX t_idx1 on t(b);
> CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
> EXPLAIN SELECT * FROM v1;
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
>
>
Thanks for pointing it out.  This is unexpected based on my current
knowledge, I will check that.


> I also think this is not the right way to handle this optimization.
>

I started to look at query_is_distinct_for when Tom pointed it out, but I
still don't understand the context fully.  I will look into your finding as
well.
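To make the uniqueness argument in this thread concrete, here is a small sketch (in Python, not PostgreSQL code; the function name and its shape are made up for illustration) of the check being discussed: a DISTINCT is redundant only if some unique index is covered by the DISTINCT column list and all of that index's columns are NOT NULL. This is also why dropping the NOT NULL constraint in the session above re-introduces duplicates, since NULLs don't conflict under a unique index.

```python
# Hypothetical helper illustrating when "SELECT DISTINCT cols" cannot
# remove any rows.  Not PostgreSQL's query_is_distinct_for(); just the
# shape of the reasoning in this thread.
def distinct_is_redundant(distinct_cols, unique_indexes, not_null_cols):
    """True if some unique index covers only NOT NULL columns that all
    appear in the DISTINCT list, so every output row is already unique."""
    distinct = set(distinct_cols)
    not_null = set(not_null_cols)
    for index_cols in unique_indexes:
        cols = set(index_cols)
        if cols <= distinct and cols <= not_null:
            return True
    return False

# Table t (a int primary key, b int not null, c int), unique index on b:
indexes = [["a"], ["b"]]
not_null = ["a", "b"]          # the primary key column is implicitly NOT NULL
print(distinct_is_redundant(["b"], indexes, not_null))  # True
print(distinct_is_redundant(["c"], indexes, not_null))  # False
# After ALTER TABLE t ALTER COLUMN b DROP NOT NULL:
print(distinct_is_redundant(["b"], indexes, ["a"]))     # False again
```

The last call shows why any such proof is only valid as long as the constraints it relies on still exist, which is the invalidation problem raised upthread.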


client-side fsync() error handling

2020-02-11 Thread Peter Eisentraut
Continuing the discussion from [0], there are still a number of fsync() 
calls in client programs that are unchecked or where errors are treated 
non-fatally.


Digging around through the call stack, I think changing fsync_fname() to 
just call exit(1) on errors instead of proceeding would address most of 
that.


This affects (at least) initdb, pg_basebackup, pg_checksums, pg_dump, 
pg_dumpall, and pg_rewind.


Thoughts?


[0]: 
https://www.postgresql.org/message-id/flat/9b49fe44-8f3e-eca9-5914-29e9e99030bf%402ndquadrant.com


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From dfe2140d34d3c50b2908556c4627d77d009e487f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 11 Feb 2020 09:07:59 +0100
Subject: [PATCH] Change client-side fsync_fname() to report errors fatally

Given all we have learned about fsync() error handling in the last few
years, reporting an fsync() error non-fatally is not useful,
unless you don't care much about the file, in which case you probably
don't need to use fsync() in the first place.

Change fsync_fname() to exit(1) on fsync() errors other than those
that we specifically chose to ignore.

This affects initdb, pg_basebackup, pg_checksums, pg_dump, pg_dumpall,
and pg_rewind.
---
 src/common/file_utils.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/common/file_utils.c b/src/common/file_utils.c
index 413fe4eeb1..078980a954 100644
--- a/src/common/file_utils.c
+++ b/src/common/file_utils.c
@@ -51,8 +51,6 @@ static void walkdir(const char *path,
  * fsyncing, and might not have privileges to write at all.
  *
  * serverVersion indicates the version of the server to be fsync'd.
- *
- * Errors are reported but not considered fatal.
  */
 void
 fsync_pgdata(const char *pg_data,
@@ -250,8 +248,8 @@ pre_sync_fname(const char *fname, bool isdir)
  * fsync_fname -- Try to fsync a file or directory
  *
  * Ignores errors trying to open unreadable files, or trying to fsync
- * directories on systems where that isn't allowed/required.  Reports
- * other errors non-fatally.
+ * directories on systems where that isn't allowed/required.  All other errors
+ * are fatal.
  */
 int
 fsync_fname(const char *fname, bool isdir)
@@ -294,9 +292,9 @@ fsync_fname(const char *fname, bool isdir)
 */
if (returncode != 0 && !(isdir && (errno == EBADF || errno == EINVAL)))
{
-   pg_log_error("could not fsync file \"%s\": %m", fname);
+   pg_log_fatal("could not fsync file \"%s\": %m", fname);
(void) close(fd);
-   return -1;
+   exit(EXIT_FAILURE);
}
 
(void) close(fd);
-- 
2.25.0
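As a rough sketch of the policy the patch adopts (written in Python rather than C, and not PostgreSQL's actual helper), the client-side rule becomes: an fsync() failure that we haven't specifically chosen to ignore terminates the program instead of being logged and then carried on past.

```python
# Illustrative only: mirrors the patch's fsync_fname() policy of exiting
# on fsync() failure rather than returning an error the caller ignores.
import os
import sys

def fsync_or_die(path):
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    except OSError as e:
        # After a failed fsync the kernel may already have discarded the
        # dirty pages, so retrying cannot recover the data; give up loudly.
        sys.stderr.write('could not fsync file "%s": %s\n' % (path, e))
        os.close(fd)
        sys.exit(1)
    os.close(fd)
```

On a healthy filesystem the call simply returns; the interesting change is that callers no longer get a `-1` they can silently ignore.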