Re: 011_crash_recovery.pl intermittently fails

2021-03-04 Thread Kyotaro Horiguchi
At Fri, 5 Mar 2021 10:07:06 +0500, Andrey Borodin  wrote 
in 
> Maybe it's offtopic here, but anyway...
> While working on "lz4 for FPIs" I've noticed that this test fails with 
> wal_compression = on.
> I did not investigate the case at that moment, but I think that it would be 
> good to run recovery regression tests with compression too.

The problem record has 15 pages of FPIs.  Reducing its size may
prevent the WAL-buffer wraparound and WAL writes.  If no WAL is
written, the test fails.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: 011_crash_recovery.pl intermittently fails

2021-03-04 Thread Kyotaro Horiguchi
At Thu, 04 Mar 2021 23:40:34 -0500, Tom Lane  wrote in 
> BTW, I tried simply removing the "allows_streaming" option from
> the test, and it failed ten times out of ten tries for me.
> So Andres is right that that makes it pretty reproducible in
> a stock build.

The difference comes from the difference in shared_buffers.  In the
"allows_streaming" case, PostgresNode::init() *reduces* shared_buffers
down to '1MB' (128 blocks), which leads to only 8 XLOG buffers, which
wrap around very soon, which leads to XLogWrite.

In the allows_streaming=0 case, the default size of shared_buffers is
128MB (16384 blocks).  The WAL buffers (512) don't wrap around
during the test, and no WAL buffer is written out in that case.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Which PG version does CVE-2021-20229 affected?

2021-03-04 Thread Michael Paquier
On Fri, Mar 05, 2021 at 12:32:43AM -0700, bchen90 wrote:
> NVD link:
> 
> https://nvd.nist.gov/vuln/detail/CVE-2021-20229#vulnCurrentDescriptionTitle

This link includes incorrect information.  CVE-2021-20229 is only a
problem in 13.0 and 13.1, fixed in 13.2.  Please see for example here:
https://www.postgresql.org/support/security/

The commit that fixed the issue is c028faf, mentioning 9ce77d7 as the
origin point, a commit introduced in Postgres 13.
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] pgbench: Bug fix for the -d option

2021-03-04 Thread Michael Paquier
On Fri, Mar 05, 2021 at 01:30:11PM +0900, Fujii Masao wrote:
>   if ((env = getenv("PGDATABASE")) != NULL && *env != '\0')
>   dbName = env;
> - else if (login != NULL && *login != '\0')
> - dbName = login;
> + else if ((env = getenv("PGUSER")) != NULL && *env != '\0')
> + dbName = env;
>   else
> - dbName = "";
> + dbName = get_user_name_or_exit(progname);
> 
> If dbName is set by libpq, the above also is not necessary?

If you remove this part, pgbench loses some log information if
PQconnectdbParams() returns NULL, like on OOM if the database name is
NULL.  Perhaps that's not worth caring about here for a single log
failure, but that's the reason why I left this part around.

Now, simplifying the code is one goal of this patch, so I would not
mind shaving a bit more of it :)
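For reference, the fallback chain in the quoted hunk can be sketched as follows (resolve_dbname and its arguments are illustrative helpers, not pgbench's actual code):

```c
#include <stdlib.h>

/*
 * Sketch of the database-name resolution order under discussion:
 * explicit -d option, then PGDATABASE, then PGUSER, then the OS user
 * name (which stands in for get_user_name_or_exit() here).
 */
static const char *
resolve_dbname(const char *dbname_opt, const char *os_user)
{
    const char *env;

    if (dbname_opt != NULL && *dbname_opt != '\0')
        return dbname_opt;
    if ((env = getenv("PGDATABASE")) != NULL && *env != '\0')
        return env;
    if ((env = getenv("PGUSER")) != NULL && *env != '\0')
        return env;
    return os_user;
}
```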
--
Michael


signature.asc
Description: PGP signature


Which PG version does CVE-2021-20229 affected?

2021-03-04 Thread bchen90
Hi, all

Recently, I looked up CVE-2021-20229 on the NVD website, which describes
the affected PG versions as "before 13.2, before 12.6, before 11.11, before
10.16, before 9.6.21 and before 9.5.25", but when I checked the official PG
website and the git commit log, I found that only version 13 is affected. So
I am confused.

Best regards


NVD link:

https://nvd.nist.gov/vuln/detail/CVE-2021-20229#vulnCurrentDescriptionTitle



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Disallow SSL compression?

2021-03-04 Thread Michael Paquier
On Thu, Mar 04, 2021 at 11:52:56PM +0100, Daniel Gustafsson wrote:
> The attached version takes a step further and removes sslcompression from
> pg_conn and just eats the value as there is no use in setting a dummy value.
> It also removes compression from PgBackendSSLStatus and be_tls_get_compression
> as raised by Michael downthread.  I opted for keeping the column in pg_stat_ssl
> with a note in the documentation that it will be removed, for the same
> backwards compatibility reason of eating the connection param without acting
> on it.  This might be overthinking it however.

FWIW, I would vote to nuke it from all those places, reducing a bit
pg_stat_get_activity() while on it.  Keeping it around in the system
catalogs may cause confusion IMHO, by making people think that it is
still possible to get into configurations where sslcompression could
be really enabled.  The rest of the patch looks fine to me.
--
Michael


signature.asc
Description: PGP signature


Re: n_mod_since_analyze isn't reset at table truncation

2021-03-04 Thread Julien Rouhaud
On Fri, Mar 05, 2021 at 03:31:33PM +0900, Fujii Masao wrote:
> 
> On 2021/03/04 12:40, Julien Rouhaud wrote:
> > > In that case, conversely, we want to trigger autoanalyze ASAP because the
> > > contents in the table were changed very much?
> > 
> > We might want, but wouldn't keeping the current n_mod_since_analyze make
> > things unpredictable?
> 
> I don't think so. autoanalyze still works based on the number of
> modifications since last analyze, so ISTM that's predictable...

If we keep n_mod_since_analyze, autoanalyze might trigger after the first write
or might wait for a full cycle of writes, depending on the kept value.  So yes,
it can make autoanalyze trigger earlier in some cases, but that's not
predictable from the TRUNCATE event's point of view.

> > Also the selectivity estimation functions already take into account the
> > actual relation size, so the estimates aren't completely off after a truncate.
> 
> But the statistics are out of date, which may affect the planning badly?
> Also if a generic plan is used, it's not updated until the next analyze, i.e.,
> a generic plan created based on old statistics is used for a while.
> 
> So I'm still not sure why you want to defer autoanalyze in that case.

I don't especially want to defer autoanalyze in that case.  But if an autoanalyze
happening quickly after a TRUNCATE is critical for performance, I'd prefer to
find a way to trigger autoanalyze reliably.
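For reference, the condition autoanalyze evaluates can be sketched as follows (default GUC values are hard-coded; the helper name is illustrative), which shows why a stale counter on a truncated table fires immediately while a reset counter waits for fresh writes:

```c
#include <stdbool.h>

/*
 * Sketch of the autoanalyze trigger condition: analyze when the tuples
 * modified since the last analyze exceed
 *   autovacuum_analyze_threshold
 *     + autovacuum_analyze_scale_factor * reltuples.
 * Defaults (50 and 0.1) are baked in for illustration.
 */
static bool
needs_autoanalyze(double n_mod_since_analyze, double reltuples)
{
    double threshold = 50.0 + 0.1 * reltuples;

    return n_mod_since_analyze > threshold;
}
```

After a TRUNCATE, reltuples is near zero, so any sizable leftover n_mod_since_analyze exceeds the threshold right away.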




RE: Avoid CommandCounterIncrement in RI trigger when INSERT INTO referencing table

2021-03-04 Thread houzj.f...@fujitsu.com
> > >  I'm worried about having this dependency in RI check, because the
> > > planner may allow parallel INSERT in these cases in the future.
> >
> > If we support parallel insert that can have other modifications in the
> > future, I think we will also be able to share or increment command ID
> > in parallel workers at that time.
> > And it seems we can remove this dependency at that time.
> > How about adding some more comments about this to remind future
> > developers?
> > /*
> >  * If extra parallel modification is supported in the future, this
> >  * dependency should be removed.
> >  */
> 
> Agreed.
> 
> I'm excited to see PostgreSQL's parallel DML work in wider use cases and
> satisfy users' expectations.

Thanks !
Attaching the first version of the patch, which avoids CCI in the RI trigger
when inserting into the referencing table.

Best regards,
houzj


0001-Avoid-CCI-in-RI-trigger-when-INSERT-fk-relation.patch
Description:  0001-Avoid-CCI-in-RI-trigger-when-INSERT-fk-relation.patch


RE: Avoid CommandCounterIncrement in RI trigger when INSERT INTO referencing table

2021-03-04 Thread tsunakawa.ta...@fujitsu.com
From: Hou, Zhijie/侯 志杰 
> From the wiki[1], CCI is to let statements can not see the rows they modify.
> 
> Here is an example of the case 1):
> (Note table referenced and referencing are both empty)
> -
> postgres=# with cte as (insert into referenced values(1)) insert into
> referencing values(1);
> -
> The INSERT here will first modify the referenced table(pk table) and then
> modify the referencing table.
> When modifying the referencing table, it has to check if the tuple to be 
> insert
> exists in referenced table.
> At this moment, CCI can let the modification on referenced in WITH visible
> when check the existence of the tuple.
> If we do not CCI here, postgres will report an constraint error because it 
> can not
> find the tuple in referenced table.

Ah, got it.  Thank you.  I'd regret it if Postgres had to give up parallel 
execution because of SQL statements that don't seem to be commonly used, like 
the above.
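The visibility behavior described in the quote can be modeled with a toy command-ID check (all names here are illustrative, not PostgreSQL internals):

```c
#include <stdbool.h>

/*
 * Toy model of command-ID visibility within one transaction: a tuple
 * inserted by command `cmin` is visible to a scan running at command
 * `curcid` only if cmin < curcid.  CommandCounterIncrement() bumps
 * curcid, which is what makes the CTE's insert into "referenced"
 * visible to the RI check on "referencing" in the example above.
 */
typedef unsigned int CommandId;

/* visibility rule: only earlier commands' tuples are visible */
static bool
tuple_visible(CommandId tuple_cmin, CommandId curcid)
{
    return tuple_cmin < curcid;
}

/* the CTE inserts at command 0; without CCI the RI check also runs at 0 */
static bool
ri_check_without_cci(void)
{
    return tuple_visible(0, 0);   /* tuple invisible -> constraint error */
}

/* with CCI, the RI check runs at command 1 and sees the tuple */
static bool
ri_check_with_cci(void)
{
    return tuple_visible(0, 1);
}
```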


> >  I'm worried about having this dependency in RI check, because the planner
> > may allow parallel INSERT in these cases in the future.
> 
> If we support parallel insert that can have other modifications in the
> future, I think we will also be able to share or increment command ID in
> parallel workers at that time.
> And it seems we can remove this dependency at that time.
> How about adding some more comments about this to remind future developers?
> /*
>  * If extra parallel modification is supported in the future, this
>  * dependency should be removed.
>  */

Agreed.

I'm excited to see PostgreSQL's parallel DML work in wider use cases and 
satisfy users' expectations.


Regards
Takayuki Tsunakawa





Re: n_mod_since_analyze isn't reset at table truncation

2021-03-04 Thread Fujii Masao




On 2021/03/04 12:40, Julien Rouhaud wrote:

On Thu, Mar 04, 2021 at 12:21:14PM +0900, Fujii Masao wrote:



On 2021/03/04 11:24, Julien Rouhaud wrote:

On Thu, Mar 04, 2021 at 10:35:19AM +0900, Masahiko Sawada wrote:


While reviewing "autoanalyze on partitioned table" patch, I realized
that pg_stat_xxx_tables.n_mod_since_analyze is not reset at TRUNCATE.
On the other hand, n_ins_since_vacuum is reset. I think it should be
reset because otherwise we end up unnecessarily triggering autoanalyze
even when the actual number of newly inserted tuples is less than the
autoanalyze thresholds.


In that case, conversely, we want to trigger autoanalyze ASAP because the 
contents in the table were changed very much?


We might want, but wouldn't keeping the current n_mod_since_analyze make
things unpredictable?


I don't think so. autoanalyze still works based on the number of
modifications since last analyze, so ISTM that's predictable...


Also the selectivity estimation functions already take into account the actual
relation size, so the estimates aren't completely off after a truncate.


But the statistics are out of date, which may affect the planning badly?
Also if a generic plan is used, it's not updated until the next analyze, i.e.,
a generic plan created based on old statistics is used for a while.

So I'm still not sure why you want to defer autoanalyze in that case.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




RE: libpq debug log

2021-03-04 Thread tsunakawa.ta...@fujitsu.com
From: Tom Lane 
> But I think passing the message start address explicitly might be better than
> having it understand the buffering behavior in enough detail to know where to
> find the message.  Part of the point here (IMO) is to decouple the tracing
> logic from the core libpq logic, in hopes of not having common-mode bugs.

Ouch, you're perfectly right.  Then let's make the signature:

void pqLogMessage(PGconn *conn, const char *message, bool toServer);

Thank you!


Regards
Takayuki Tsunakawa





Re: Shared memory size computation oversight?

2021-03-04 Thread Julien Rouhaud
On Thu, Mar 04, 2021 at 12:53:54PM -0500, Tom Lane wrote:
> My conclusion is that we don't need to do anything, indeed the proposed
> changes will probably just lead to overestimation.
> 
> It's certainly possible that there's something amiss somewhere.  These
> numbers were all taken with out-of-the-box configuration, so it could be
> that changing some postgresql.conf entries would expose that some module
> is not scaling its request correctly.  Also, I don't have any extensions
> loaded, so this proves nothing about the correctness of any of those.
> But it appears to me that the general scheme is working perfectly fine,
> so we do not need to complicate it.

Thanks for looking at it.  I agree that most of the changes aren't worth the
complication and risk to overestimate the shmem size.

But I did some more experiments comparing the current version and the patched
version of the lock and pred lock shared hash table size estimations with some
configuration changes, and I find the following delta in the estimated size:

- original configuration : 15kB
- max_connection = 1000 : 30kB
- max_connection = 1000 and max_locks_per_xact = 256 : 96kB

I don't know if my version is totally wrong or not, but if it's not, it could be
worthwhile to apply the hash table part, as the delta would mean that the current
estimation doesn't scale properly.
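For reference, the way the lock table's element count scales with the GUCs varied above can be sketched as follows (modeled on NLOCKENTS(); max_connections stands in for MaxBackends here, while the real value also counts autovacuum and other auxiliary processes, so this is an approximation for illustration only):

```c
/*
 * Sketch of the main lock table's element-count scaling:
 *   max_locks_per_transaction * (MaxBackends + max_prepared_transactions)
 * Multiplying this by the per-entry size gives the order of magnitude
 * of the shared hash table request.
 */
static long
lock_table_elements(int max_connections, int max_locks_per_xact,
                    int max_prepared_xacts)
{
    return (long) max_locks_per_xact * (max_connections + max_prepared_xacts);
}
```

Going from (100 connections, 64 locks/xact) to (1000 connections, 256 locks/xact) grows the entry count 40x, which is the regime where an estimation bug would show up as the deltas measured above.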




Re: Improvements and additions to COPY progress reporting

2021-03-04 Thread Michael Paquier
On Thu, Mar 04, 2021 at 05:45:50PM +0100, Matthias van de Meent wrote:
> Sure, I'm convinced. PFA the patchset with this change applied.

0002 looks fine to me, and in line with the discussion, so applied.
--
Michael


signature.asc
Description: PGP signature


Re: SQL-standard function body

2021-03-04 Thread Jaime Casanova
On Tue, Mar 2, 2021 at 12:13 PM Peter Eisentraut
 wrote:
>
> On 11.02.21 09:02, Peter Eisentraut wrote:
> >> Here is an updated patch to get it building again.
> >
> > Another updated patch to get things building again.  I've also fixed the
> > last TODO I had in there in qualifying function arguments as necessary
> > in ruleutils.c.
>
> Updated patch to resolve merge conflict.  No changes in functionality.

Hi,

I was making some tests with this patch and found this problem:

"""
CREATE OR REPLACE FUNCTION public.make_table()
 RETURNS void
 LANGUAGE sql
BEGIN ATOMIC
  CREATE TABLE created_table AS SELECT * FROM int8_tbl;
END;
ERROR:  unrecognized token: "?"
CONTEXT:  SQL function "make_table"
"""

Attached a backtrace from the point the error is thrown.

-- 
Jaime Casanova
Director de Servicios Profesionales
SYSTEMGUARDS - Consultores de PostgreSQL
#0  nodeRead (
token=0x55c62af712b4 "? :resultRelation 0 :hasAggs false :hasWindowFuncs 
false :hasTargetSRFs false :hasSubLinks false :hasDistinctOn false 
:hasRecursive false :hasModifyingCTE false :hasForUpdate false :hasRowSecurity 
fal"..., tok_len=1)
at read.c:421
__errno_location = 
result = 0x55c62af712b3
type = 104
__func__ = "nodeRead"
#1  0x55c62a706491 in _readQuery () at readfuncs.c:255
local_node = 0x55c62af71b20
token = 0x55c62af712a7 ":utilityStmt ? :resultRelation 0 :hasAggs false 
:hasWindowFuncs false :hasTargetSRFs false :hasSubLinks false :hasDistinctOn 
false :hasRecursive false :hasModifyingCTE false :hasForUpdate false :hasRo"...
length = 12
#2  0x55c62a711588 in parseNodeString () at readfuncs.c:2701
return_value = 0xf424308e6
token = 0x55c62af71273 "QUERY :commandType 5 :querySource 0 :canSetTag 
true :utilityStmt ? :resultRelation 0 :hasAggs false :hasWindowFuncs false 
:hasTargetSRFs false :hasSubLinks false :hasDistinctOn false :hasRecursive 
fal"...
length = 5
__func__ = "parseNodeString"
#3  0x55c62a705b74 in nodeRead (
token=0x55c62af71272 "{QUERY :commandType 5 :querySource 0 :canSetTag true 
:utilityStmt ? :resultRelation 0 :hasAggs false :hasWindowFuncs false 
:hasTargetSRFs false :hasSubLinks false :hasDistinctOn false :hasRecursive 
fa"..., tok_len=1)
at read.c:334
result = 0x55c62af71273
type = 103
__func__ = "nodeRead"
#4  0x55c62a705e8d in nodeRead (
token=0x55c62af71272 "{QUERY :commandType 5 :querySource 0 :canSetTag true 
:utilityStmt ? :resultRelation 0 :hasAggs false :hasWindowFuncs false 
:hasTargetSRFs false :hasSubLinks false :hasDistinctOn false :hasRecursive 
fa"..., tok_len=1)
at read.c:400
l = 0x0
result = 0x55c62af71272
type = 102
__func__ = "nodeRead"
#5  0x55c62a705e8d in nodeRead (
token=0x55c62af71271 "({QUERY :commandType 5 :querySource 0 :canSetTag true 
:utilityStmt ? :resultRelation 0 :hasAggs false :hasWindowFuncs false 
:hasTargetSRFs false :hasSubLinks false :hasDistinctOn false :hasRecursive 
f"..., tok_len=1)
at read.c:400
l = 0x0
result = 0x277
type = 102
__func__ = "nodeRead"
#6  0x55c62a705700 in stringToNodeInternal (
str=0x55c62af71270 "(({QUERY :commandType 5 :querySource 0 :canSetTag true 
:utilityStmt ? :resultRelation 0 :hasAggs false :hasWindowFuncs false 
:hasTargetSRFs false :hasSubLinks false :hasDistinctOn false :hasRecursive 
"..., 
restore_loc_fields=false) at read.c:74
retval = 0x7ffd9bafc9f0
save_strtok = 0x55c62af712b5 " :resultRelation 0 :hasAggs false 
:hasWindowFuncs false :hasTargetSRFs false :hasSubLinks false :hasDistinctOn 
false :hasRecursive false :hasModifyingCTE false :hasForUpdate false 
:hasRowSecurity fals"...
#7  0x55c62a705732 in stringToNode (
str=0x55c62af71270 "(({QUERY :commandType 5 :querySource 0 :canSetTag true 
:utilityStmt ? :resultRelation 0 :hasAggs false :hasWindowFuncs false 
:hasTargetSRFs false :hasSubLinks false :hasDistinctOn false :hasRecursive 
"...) at read.c:91
No locals.
#8  0x55c62a501a7d in fmgr_sql_validator (fcinfo=0x7ffd9bafcb00) at 
pg_proc.c:895
n = 0x55c62af065a0
funcoid = 23208
tuple = 0x7f95f02b9348
proc = 0x7f95f02b9390
raw_parsetree_list = 0x55c62af065a0
querytree_list = 0x55c62aa27bf2 
lc = 0x7ffd9bafcb20
isnull = false
tmp = 140281956242456
prosrc = 0x55c62af065a0 "\240e\360*\306U"
callback_arg = {proname = 0x7f95f02b9394 "make_table", prosrc = 0x0}
sqlerrcontext = {previous = 0x0, callback = 0x55c62a501c77 
, 
  arg = 0x7ffd9bafca50}
haspolyarg = false
i = 0
__func__ = "fmgr_sql_validator"
#9  0x55c62aa29c5f in FunctionCall1Coll (flinfo=0x7ffd9bafcb60, 
collation=0, arg1=23208) at fmgr.c:1141
fcinfodata = {fcinfo = {flinfo = 0x7ffd9bafcb60, context = 0x0, 
resultinfo = 0x0, fncollation = 0, isnull = 

Re: libpq debug log

2021-03-04 Thread Tom Lane
"tsunakawa.ta...@fujitsu.com"  writes:
> Oops, the logging facility needs the destination (conn->pfdebug).  So, it 
> will be:

> void pqLogMessage(PGconn *conn, bool toServer);

Right, and it'd need the debug flags too, so +1 for just passing the
PGconn instead of handing those over piecemeal.

But I think passing the message start address explicitly might be
better than having it understand the buffering behavior in enough
detail to know where to find the message.  Part of the point here
(IMO) is to decouple the tracing logic from the core libpq logic,
in hopes of not having common-mode bugs.

regards, tom lane




RE: libpq debug log

2021-03-04 Thread tsunakawa.ta...@fujitsu.com
From: tsunakawa.ta...@fujitsu.com 
> I understood that the former is pqParseInput3() and the latter is
> pqPutMsgEnd().  They call the logging function when conn->pfdebug is not
> NULL.  Its signature is like this (that will be defined in libpq-trace.c):
> 
> void pqLogMessage(const char *message, bool toServer);

Oops, the logging facility needs the destination (conn->pfdebug).  So, it will be:

void pqLogMessage(PGconn *conn, bool toServer);

The function should know where to extract the message from.


Regards
Takayuki Tsunakawa





RE: libpq debug log

2021-03-04 Thread tsunakawa.ta...@fujitsu.com
Tom-san, Alvaro-san,


From: Tom Lane 
> I took a quick look through the v22 patch, and TBH I don't like much of 
> anything
> at all about the proposed architecture.  It's retained most of the flavor of 
> the
> way it was done before, which was a hangover from the old process-on-the-fly
> scheme.

Yes, the patch just follows the current style of interspersing.  (But some 
places are removed so it's a bit cleaner.)

Anyway, I think your suggestion should be better, resulting in much cleaner 
separation of message handling and logging.


> I think the right way to do this, now that we've killed off v2 protocol 
> support, is to
> rely on the fact that libpq fully buffers both incoming and outgoing messages.
> We should nuke every last bit of the existing code that intersperses tracing 
> logic
> with normal message decoding and construction, and instead have two tracing
> entry points:

FWIW, I was surprised and glad when I saw the commit message to remove the v2 
protocol.


> (1) Log a received message.  This is called as soon as we know (from the
> length word) that we have the whole message available, and it's just passed a
> pointer into the input buffer.  It should examine the input message for itself
> and print the details.
> 
> (2) Log a message-to-be-sent.  Again, call it as soon as we've constructed a
> complete message in the output buffer, and let it examine and print that
> message for itself.

I understood that the former is pqParseInput3() and the latter is 
pqPutMsgEnd().  They call the logging function when conn->pfdebug is not NULL.  
Its signature is like this (that will be defined in libpq-trace.c):

void pqLogMessage(const char *message, bool toServer);

The message argument points to the message type byte.  toServer is true for 
messages sent to the server.  The signature could alternatively be one of the 
following, but this is a matter of taste:

void pqLogMessage(char message_type, const char *message, bool toServer);
void pqLogMessage(char message_type, int length, const char *message, bool 
toServer);

This function simply outputs the timestamp, message direction, message type, 
length, and message contents.  The output processing of message contents 
branches on a message type with a switch-case statement.  Perhaps the output 
processing of each message should be separated into its own function like 
pqLogMessageD() for 'D' message so that the indentation won't get deep in the 
main logging function.
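The two-entry-point design sketched here could look roughly like the following, written from the protocol-3 framing alone (a type byte followed by a 4-byte big-endian length that counts itself); the function name, output format, and the handled message subset are illustrative, not the actual patch:

```c
#include <stdio.h>
#include <stdint.h>

/* Decode a 4-byte big-endian integer from the wire. */
static uint32_t
read_be32(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/*
 * Format a one-line summary of a complete protocol message into buf:
 * direction, type byte, length, and a per-type label.  Real per-type
 * field decoding would go in the switch cases.
 */
static void
pq_log_message(char *buf, size_t buflen,
               const unsigned char *message, int to_server)
{
    char        type = (char) message[0];
    uint32_t    length = read_be32(message + 1);
    const char *name;

    switch (type)
    {
        case 'Q':  name = "Query";         break;
        case 'D':  name = "DataRow";       break;
        case '1':  name = "ParseComplete"; break;
        default:   name = "unhandled";     break;
    }
    snprintf(buf, buflen, "%s %c %u: %s",
             to_server ? ">" : "<", type, length, name);
}
```

Because it only reads the framing, such a function can be called from pqParseInput3() and pqPutMsgEnd() with just a pointer to the message start, without touching the decoding state.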


> Both of these pieces of logic could be written directly from the protocol
> specification, without any interaction with the main libpq code paths, which
> would be a Good Thing IMO.  The current intertwined approach is somewhere
> between useless and counterproductive if you're in need of tracing down a
> libpq bug.  (In other words, I'm suggesting that the apparent need for
> duplicate logic would be good not bad, and indeed that it'd be best to write 
> the
> tracing logic without consulting the existing libpq code at all.)

Yes, the only thing I'm concerned about is having two places that have to be 
aware of the message format -- message encoding/decoding and logging.  But this 
cleaner separation would pay off.

Alvaro-san, what do you think?  Anyway, we'll soon post the updated patch that 
fixes the repeated ParseComplete issue based on the current patch.  After that, 
we'll send a patch based on Tom-san's idea.


Regards
Takayuki Tsunakawa





Re: Get memory contexts of an arbitrary backend process

2021-03-04 Thread Fujii Masao




On 2021/03/04 18:32, torikoshia wrote:

On 2021-01-14 19:11, torikoshia wrote:

Since pg_get_target_backend_memory_contexts() waits to dump memory,
it could lead to a deadlock as below.

  - session1
  BEGIN; TRUNCATE t;

  - session2
  BEGIN; TRUNCATE t; -- wait

  - session1
  SELECT * FROM pg_get_target_backend_memory_contexts(); --wait


Thanks for notifying me, Fujii-san.


Attached v8 patch that prohibited calling the function inside transactions.


Regrettably, this modification could not cope with the advisory lock and
I haven't come up with a good way to deal with it.

It seems to me that the architecture of the requestor waiting for the
dumper leads to this problem and complicates things.


Considering the discussion on printing backtraces[1], it seems
reasonable that the requestor just sends a signal and the dumper dumps to
the log file.


+1

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: pgbench: option delaying queries till connections establishment?

2021-03-04 Thread Thomas Munro
On Thu, Mar 4, 2021 at 10:44 PM Thomas Munro  wrote:
> v10-0002-pgbench-Refactor-the-way-we-do-thread-portabilit.patch

Here's a better version of that part.  I don't yet know if it actually
works on Windows...
From 3aa63dfc086ab1f687ed668091a6bda8bf270fa7 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Sat, 2 Jan 2021 15:05:06 +1300
Subject: [PATCH v11 1/6] Add missing pthread_barrier_t.

Supply a simple implementation of the missing pthread_barrier_t type and
functions, for macOS.

Discussion: https://postgr.es/m/20200227180100.zyvjwzcpiokfsqm2%40alap3.anarazel.de
---
 configure   | 69 +
 configure.ac|  2 +
 src/include/pg_config.h.in  |  3 ++
 src/include/port/pg_pthread.h   | 41 
 src/port/pthread_barrier_wait.c | 66 +++
 src/tools/msvc/Solution.pm  |  1 +
 6 files changed, 182 insertions(+)
 create mode 100644 src/include/port/pg_pthread.h
 create mode 100644 src/port/pthread_barrier_wait.c

diff --git a/configure b/configure
index ce9ea36999..fad817bb38 100755
--- a/configure
+++ b/configure
@@ -11635,6 +11635,62 @@ if test "$ac_res" != no; then :
 
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing pthread_barrier_wait" >&5
+$as_echo_n "checking for library containing pthread_barrier_wait... " >&6; }
+if ${ac_cv_search_pthread_barrier_wait+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_func_search_save_LIBS=$LIBS
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+/* Override any GCC internal prototype to avoid an error.
+   Use char because int might match the return type of a GCC
+   builtin and then its argument prototype would still apply.  */
+#ifdef __cplusplus
+extern "C"
+#endif
+char pthread_barrier_wait ();
+int
+main ()
+{
+return pthread_barrier_wait ();
+  ;
+  return 0;
+}
+_ACEOF
+for ac_lib in '' pthread; do
+  if test -z "$ac_lib"; then
+ac_res="none required"
+  else
+ac_res=-l$ac_lib
+LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
+  fi
+  if ac_fn_c_try_link "$LINENO"; then :
+  ac_cv_search_pthread_barrier_wait=$ac_res
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext
+  if ${ac_cv_search_pthread_barrier_wait+:} false; then :
+  break
+fi
+done
+if ${ac_cv_search_pthread_barrier_wait+:} false; then :
+
+else
+  ac_cv_search_pthread_barrier_wait=no
+fi
+rm conftest.$ac_ext
+LIBS=$ac_func_search_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_pthread_barrier_wait" >&5
+$as_echo "$ac_cv_search_pthread_barrier_wait" >&6; }
+ac_res=$ac_cv_search_pthread_barrier_wait
+if test "$ac_res" != no; then :
+  test "$ac_res" = "none required" || LIBS="$ac_res $LIBS"
+
+fi
+
 # Solaris:
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing fdatasync" >&5
 $as_echo_n "checking for library containing fdatasync... " >&6; }
@@ -15883,6 +15939,19 @@ esac
 
 fi
 
+ac_fn_c_check_func "$LINENO" "pthread_barrier_wait" "ac_cv_func_pthread_barrier_wait"
+if test "x$ac_cv_func_pthread_barrier_wait" = xyes; then :
+  $as_echo "#define HAVE_PTHREAD_BARRIER_WAIT 1" >>confdefs.h
+
+else
+  case " $LIBOBJS " in
+  *" pthread_barrier_wait.$ac_objext "* ) ;;
+  *) LIBOBJS="$LIBOBJS pthread_barrier_wait.$ac_objext"
+ ;;
+esac
+
+fi
+
 ac_fn_c_check_func "$LINENO" "pwrite" "ac_cv_func_pwrite"
 if test "x$ac_cv_func_pwrite" = xyes; then :
   $as_echo "#define HAVE_PWRITE 1" >>confdefs.h
diff --git a/configure.ac b/configure.ac
index f54f65febe..0ed53571dd 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1143,6 +1143,7 @@ AC_SEARCH_LIBS(getopt_long, [getopt gnugetopt])
 AC_SEARCH_LIBS(shm_open, rt)
 AC_SEARCH_LIBS(shm_unlink, rt)
 AC_SEARCH_LIBS(clock_gettime, [rt posix4])
+AC_SEARCH_LIBS(pthread_barrier_wait, pthread)
 # Solaris:
 AC_SEARCH_LIBS(fdatasync, [rt posix4])
 # Required for thread_test.c on Solaris
@@ -1743,6 +1744,7 @@ AC_REPLACE_FUNCS(m4_normalize([
 	mkdtemp
 	pread
 	preadv
+	pthread_barrier_wait
 	pwrite
 	pwritev
 	random
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 04dc330119..7a7cc21d8d 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -424,6 +424,9 @@
 /* Define if you have POSIX threads libraries and header files. */
 #undef HAVE_PTHREAD
 
+/* Define to 1 if you have the `pthread_barrier_wait' function. */
+#undef HAVE_PTHREAD_BARRIER_WAIT
+
 /* Define to 1 if you have the `pthread_is_threaded_np' function. */
 #undef HAVE_PTHREAD_IS_THREADED_NP
 
diff --git a/src/include/port/pg_pthread.h b/src/include/port/pg_pthread.h
new file mode 100644
index 00..5222cdce6e
--- /dev/null
+++ b/src/include/port/pg_pthread.h
@@ -0,0 +1,41 @@
+/*-
+ *
+ * Declarations for missing POSIX thread components.
+ *
+ *	  Currently this supplies an implementation of pthread_barrier_t for the
+ *	  benefit of 

Re: 011_crash_recovery.pl intermittently fails

2021-03-04 Thread Andrey Borodin



> On 5 March 2021, at 08:32, Tom Lane  wrote:
> 
> Kyotaro Horiguchi  writes:
>> I noticed that 011_crash_recovery.pl intermittently (that being said,
>> one out of three or so on my environment) fails in the second test.
> 
> Hmmm ... what environment is that?  This test script hasn't changed
> meaningfully in several years, and we have not seen any real issues
> with it up to now.

Maybe it's offtopic here, but anyway...
While working on "lz4 for FPIs" I've noticed that this test fails with 
wal_compression = on.
I did not investigate the case at that moment, but I think that it would be 
good to run recovery regression tests with compression too.

Best regards, Andrey Borodin.



Re: 011_crash_recovery.pl intermittently fails

2021-03-04 Thread Kyotaro Horiguchi
At Fri, 05 Mar 2021 13:21:48 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> At Fri, 05 Mar 2021 13:13:04 +0900 (JST), Kyotaro Horiguchi 
>  wrote in 
> > At Thu, 04 Mar 2021 23:02:09 -0500, Tom Lane  wrote in 
> > > Having said that, it's still true that this test has been stable in
> > > the buildfarm.  Andres explained what he had to perturb to make it
> > > fail, so I'm still interested in what Horiguchi-san did to break it.
> > 
> > CONFIGURE =  '--enable-debug' '--enable-cassert' '--enable-tap-tests' 
> > '--enable-depend' '--enable-nls' '--with-icu' '--with-openssl' 
> > '--with-libxml' '--with-uuid=e2fs' '--with-tcl' '--with-llvm' 
> > '--prefix=/home/horiguti/bin/pgsql_work' 
> > 'LLVM_CONFIG=/usr/bin/llvm-config-64' 'CC=/usr/lib64/ccache/gcc' 
> > 'CLANG=/usr/lib64/ccache/clang' 'CFLAGS=-O0' '--with-wal-blocksize=16'
> > 
> > the WAL block size might have affected.  I'll recheck without it.
> 
> Ok, I don't see the failure.  I guess that the WAL records for the
> last transaction cross a block boundary with 8kB WAL blocks, but not
> with 16kB blocks.

In the failure case with 16kB WAL blocks, tx538 ends with a commit
record at 0/01648B98 and nothing follows (other than the records added
after restart).

In the successful case, tx538 ends at the same LSN and is followed by
INSERT@tx539 at 0/1648CE0 up to INSERT_LEAF at 0/0165BFD8-1. So tx539
just fits inside the block (0x1648000-0x164BFFF). That amount of WAL
must cross an 8kB boundary.

Actually with 8kB blocks, tx538 ends at 0/0164B1A8 and tx539 starts at
0/0164B2A8 and ends at 0/0165E7C8, crossing a boundary at 0/0164C000
and 0/016E000.
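The boundary reasoning above reduces to a simple check (the helper name is illustrative): a WAL byte range [start, end) crosses an XLOG block boundary iff its first and last bytes fall in different blocks.

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * Does the half-open byte range [start, end) cross an XLOG block
 * boundary for the given block size?  True iff the first and last
 * bytes of the range land in different blocks.
 */
static bool
crosses_block_boundary(uint64_t start, uint64_t end, uint64_t blksz)
{
    return (start / blksz) != ((end - 1) / blksz);
}
```

For example, a record range starting at 0x1648CE0 and staying below 0x164C000 fits inside one 16kB block (which spans 0x1648000-0x164BFFF) but crosses the 8kB boundary at 0x164A000.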

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: n_mod_since_analyze isn't reset at table truncation

2021-03-04 Thread Masahiko Sawada
On Thu, Mar 4, 2021 at 12:39 PM Julien Rouhaud  wrote:
>
> On Thu, Mar 04, 2021 at 12:21:14PM +0900, Fujii Masao wrote:
> >
> >
> > On 2021/03/04 11:24, Julien Rouhaud wrote:
> > > On Thu, Mar 04, 2021 at 10:35:19AM +0900, Masahiko Sawada wrote:
> > > >
> > > > While reviewing "autoanalyze on partitioned table" patch, I realized
> > > > that pg_stat_xxx_tables.n_mod_since_analyze is not reset at TRUNCATE.
> > > > On the other hand, n_ins_since_vacuum is reset. I think it should be
> > > > reset because otherwise we end up unnecessarily triggering autoanalyze
> > > > even when the actual number of newly inserted tuples is less than the
> > > > autoanalyze thresholds.
> >
> > In that case, conversely, we want to trigger autoanalyze ASAP because the
> > contents in the table were changed very much?
>
> We might want, but wouldn't keeping the current n_mod_since_analyze make
> things unpredictable?

Right. I guess we should manually execute ANALYZE on the table to
refresh the statistics if we want to do that ASAP. It seems to me it
doesn't make any sense to trigger autoanalyze earlier than expected.
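For context, autoanalyze fires once the modification counter exceeds the documented threshold autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * reltuples. A sketch (with illustrative numbers) of why a stale n_mod_since_analyze after TRUNCATE triggers it spuriously:

```python
def needs_autoanalyze(n_mod_since_analyze, reltuples,
                      analyze_threshold=50, scale_factor=0.1):
    # Documented autoanalyze trigger condition (default GUC values).
    return n_mod_since_analyze > analyze_threshold + scale_factor * reltuples

# Before TRUNCATE: 100k-row table with 30k accumulated modifications.
print(needs_autoanalyze(30_000, 100_000))  # True: analyze was due anyway

# After TRUNCATE without resetting the counter: the table is now tiny,
# but the stale counter still exceeds the much smaller threshold.
print(needs_autoanalyze(30_000, 100))      # True -> spurious autoanalyze

# With the counter reset at TRUNCATE, only real new activity counts.
print(needs_autoanalyze(10, 100))          # False
```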

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: n_mod_since_analyze isn't reset at table truncation

2021-03-04 Thread Masahiko Sawada
On Thu, Mar 4, 2021 at 11:23 AM Julien Rouhaud  wrote:
>
> On Thu, Mar 04, 2021 at 10:35:19AM +0900, Masahiko Sawada wrote:
> >
> > While reviewing "autoanalyze on partitioned table" patch, I realized
> > that pg_stat_xxx_tables.n_mod_since_analyze is not reset at TRUNCATE.
> > On the other hand, n_ins_since_vacuum is reset. I think it should be
> > reset because otherwise we end up unnecessarily triggering autoanalyze
> > even when the actual number of newly inserted tuples is less than the
> > autoanalyze thresholds.
>
> Agreed.
>
> > I've attached a small patch to fix this. Please review it.
>
> The patch looks sensible to me, but the stats.sql (around l. 158) test should
> be modified to also check for effect on that field.

Thank you for looking at the patch!

Agreed. I've attached the updated patch.

I was going to add tests for both n_mod_since_analyze and
n_ins_since_vacuum, but it seems to require somewhat changing the
pgstat code to check n_ins_since_vacuum. Since n_ins_since_vacuum
internally uses the counter that is also used for n_tup_ins, whose
value isn't reset at TRUNCATE, the value isn't reset in some cases,
for example, where a table stats message for a new table has
truncation information (i.e., tabmsg->t_counts.t_truncated is true).
For example, even if we add a check of n_ins_since_vacuum in stats.sql
(at L158), the value is not reset by running the stats.sql regression
test. So in this patch, I added a check just for n_mod_since_analyze.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


reset_n_mod_since_analyze_v2.patch
Description: Binary data


Re: 011_crash_recovery.pl intermittently fails

2021-03-04 Thread Tom Lane
Thomas Munro  writes:
> On Fri, Mar 5, 2021 at 5:10 PM Tom Lane  wrote:
>> Alternatively, maybe we can salvage the function's usefulness by making it
>> flush WAL before returning?

> To make pg_xact_status()'s result reliable, don't you need to make
> pg_current_xact_id() flush?  In other words, isn't it at the point
> that you *observe* the transaction that you have to make sure that
> this transaction ID won't be reused after crash recovery.  Before
> that, it's simultaneously allocated and unallocated, like the cat.

We need to be sure that the XID is written out to WAL before we
let the client see it, yeah.  I've not looked to see exactly
where in the code would be the best place.

BTW, I tried simply removing the "allows_streaming" option from
the test, and it failed ten times out of ten tries for me.
So Andres is right that that makes it pretty reproducible in
a stock build.

regards, tom lane




Re: 011_crash_recovery.pl intermittently fails

2021-03-04 Thread Thomas Munro
On Fri, Mar 5, 2021 at 5:10 PM Tom Lane  wrote:
> I wrote:
> > I'd be kind of inclined to remove this test script altogether, on the
> > grounds that it's wasting cycles on a function that doesn't really
> > do what is claimed (and we should remove the documentation claim, too).
>
> Alternatively, maybe we can salvage the function's usefulness by making it
> flush WAL before returning?

To make pg_xact_status()'s result reliable, don't you need to make
pg_current_xact_id() flush?  In other words, isn't it at the point
that you *observe* the transaction that you have to make sure that
this transaction ID won't be reused after crash recovery.  Before
that, it's simultaneously allocated and unallocated, like the cat.
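The race can be modeled in miniature (a purely illustrative toy, not PostgreSQL code): an xid observed before its WAL reaches disk can be handed out again after crash recovery, while a flushed xid cannot.

```python
class MiniWAL:
    """Toy model: xids become durable only once their record is flushed."""
    def __init__(self):
        self.flushed = []   # records that survived to disk
        self.buffer = []    # records still in WAL buffers

    def begin_xact(self, flush=False):
        # Allocate the next xid after the highest known one.
        used = self.flushed + self.buffer
        xid = max(used, default=0) + 1
        self.buffer.append(xid)
        if flush:
            self.flush()
        return xid          # the client "observes" the xid here

    def flush(self):
        self.flushed += self.buffer
        self.buffer = []

    def crash_and_recover(self):
        self.buffer = []    # unflushed WAL is lost

wal = MiniWAL()
xid = wal.begin_xact(flush=False)   # observed, but never written out
wal.crash_and_recover()
print(wal.begin_xact() == xid)      # True: the same xid is reused

wal2 = MiniWAL()
xid2 = wal2.begin_xact(flush=True)  # flushed before being observed
wal2.crash_and_recover()
print(wal2.begin_xact() == xid2)    # False: a fresh xid is allocated
```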




Re: [PATCH] pgbench: Bug fix for the -d option

2021-03-04 Thread Fujii Masao




On 2021/03/05 11:26, miyake_kouta wrote:

2021-03-04 21:11, Michael Paquier wrote:

Like any other src/bin/ facilities, let's instead
remove all the getenv() calls at the beginning of pgbench.c and let's
libpq handle those environment variables if the parameters are NULL
(aka in the case of no values given directly in the options).  This
requires to move the debug log after doConnect(), which is no big deal
anyway as a failure results in an exit(1) immediately with a log
telling where the connection failed.


if ((env = getenv("PGDATABASE")) != NULL && *env != '\0')
dbName = env;
-   else if (login != NULL && *login != '\0')
-   dbName = login;
+   else if ((env = getenv("PGUSER")) != NULL && *env != '\0')
+   dbName = env;
else
-   dbName = "";
+   dbName = get_user_name_or_exit(progname);

If dbName is set by libpq, isn't the above also unnecessary?
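For reference, the fallback order in the quoted hunk is PGDATABASE, then PGUSER, then the OS user name; libpq applies essentially the same chain itself when the dbname parameter is left NULL. A sketch of the chain (a hypothetical helper, not pgbench code):

```python
import os
import getpass

def default_dbname(explicit=None, env=os.environ):
    # Mirror of the quoted fallback order; libpq applies essentially the
    # same logic internally when dbname is left NULL.
    if explicit:
        return explicit
    if env.get("PGDATABASE"):
        return env["PGDATABASE"]
    if env.get("PGUSER"):
        return env["PGUSER"]
    return getpass.getuser()

print(default_dbname("bench"))                     # an explicit dbname wins
print(default_dbname(env={"PGDATABASE": "mydb"}))  # then PGDATABASE
print(default_dbname(env={"PGUSER": "alice"}))     # then PGUSER
```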

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: 011_crash_recovery.pl intermittently fails

2021-03-04 Thread Kyotaro Horiguchi
At Fri, 05 Mar 2021 13:13:04 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> At Thu, 04 Mar 2021 23:02:09 -0500, Tom Lane  wrote in 
> > Peter Geoghegan  writes:
> > > On Thu, Mar 4, 2021 at 7:32 PM Tom Lane  wrote:
> > >> Hmmm ... what environment is that?  This test script hasn't changed
> > >> meaningfully in several years, and we have not seen any real issues
> > >> with it up to now.
> > 
> > > Did you see this recent thread?
> > > https://postgr.es/m/20210208215206.mqmrnpkaqrdtm...@alap3.anarazel.de
> > 
> > Hadn't paid much attention at the time, but yeah, it looks like Andres
> > tripped over some variant of this.
> > 
> > I'd be kind of inclined to remove this test script altogether, on the
> > grounds that it's wasting cycles on a function that doesn't really
> > do what is claimed (and we should remove the documentation claim, too).
> > 
> > Having said that, it's still true that this test has been stable in
> > the buildfarm.  Andres explained what he had to perturb to make it
> > fail, so I'm still interested in what Horiguchi-san did to break it.
> 
> CONFIGURE =  '--enable-debug' '--enable-cassert' '--enable-tap-tests' 
> '--enable-depend' '--enable-nls' '--with-icu' '--with-openssl' 
> '--with-libxml' '--with-uuid=e2fs' '--with-tcl' '--with-llvm' 
> '--prefix=/home/horiguti/bin/pgsql_work' 
> 'LLVM_CONFIG=/usr/bin/llvm-config-64' 'CC=/usr/lib64/ccache/gcc' 
> 'CLANG=/usr/lib64/ccache/clang' 'CFLAGS=-O0' '--with-wal-blocksize=16'
> 
> the WAL block size might have had an effect.  I'll recheck without it.

Ok, I don't see the failure.  I guess that the WAL records for the
last transaction cross a block boundary with 8kB WAL blocks, but not
with 16kB blocks.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: 011_crash_recovery.pl intermittently fails

2021-03-04 Thread Kyotaro Horiguchi
At Thu, 04 Mar 2021 23:02:09 -0500, Tom Lane  wrote in 
> Peter Geoghegan  writes:
> > On Thu, Mar 4, 2021 at 7:32 PM Tom Lane  wrote:
> >> Hmmm ... what environment is that?  This test script hasn't changed
> >> meaningfully in several years, and we have not seen any real issues
> >> with it up to now.
> 
> > Did you see this recent thread?
> > https://postgr.es/m/20210208215206.mqmrnpkaqrdtm...@alap3.anarazel.de
> 
> Hadn't paid much attention at the time, but yeah, it looks like Andres
> tripped over some variant of this.
> 
> I'd be kind of inclined to remove this test script altogether, on the
> grounds that it's wasting cycles on a function that doesn't really
> do what is claimed (and we should remove the documentation claim, too).
> 
> Having said that, it's still true that this test has been stable in
> the buildfarm.  Andres explained what he had to perturb to make it
> fail, so I'm still interested in what Horiguchi-san did to break it.

CONFIGURE =  '--enable-debug' '--enable-cassert' '--enable-tap-tests' 
'--enable-depend' '--enable-nls' '--with-icu' '--with-openssl' '--with-libxml' 
'--with-uuid=e2fs' '--with-tcl' '--with-llvm' 
'--prefix=/home/horiguti/bin/pgsql_work' 'LLVM_CONFIG=/usr/bin/llvm-config-64' 
'CC=/usr/lib64/ccache/gcc' 'CLANG=/usr/lib64/ccache/clang' 'CFLAGS=-O0' 
'--with-wal-blocksize=16'

the WAL block size might have had an effect.  I'll recheck without it.

FWIW xfs/CentOS8/VirtualBox/Windows10

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: 011_crash_recovery.pl intermittently fails

2021-03-04 Thread Tom Lane
I wrote:
> I'd be kind of inclined to remove this test script altogether, on the
> grounds that it's wasting cycles on a function that doesn't really
> do what is claimed (and we should remove the documentation claim, too).

Alternatively, maybe we can salvage the function's usefulness by making it
flush WAL before returning?

If we go that route, then we have the opposite problem with respect
to the test script: rather than trying to make it paper over the
function's problems, we ought to try to make it reliably fail with
the code as it stands.

regards, tom lane




RE: [PATCH] pgbench: improve \sleep meta command

2021-03-04 Thread kuroda.hay...@fujitsu.com
Dear Miyake-san, 

>   I'm not sure but I think this is caused because `my_command->argv[2]` 
> becomes "foo".

Sorry, I missed some lines in your patch. Please ignore this analysis.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED





Re: 011_crash_recovery.pl intermittently fails

2021-03-04 Thread Tom Lane
Peter Geoghegan  writes:
> On Thu, Mar 4, 2021 at 7:32 PM Tom Lane  wrote:
>> Hmmm ... what environment is that?  This test script hasn't changed
>> meaningfully in several years, and we have not seen any real issues
>> with it up to now.

> Did you see this recent thread?
> https://postgr.es/m/20210208215206.mqmrnpkaqrdtm...@alap3.anarazel.de

Hadn't paid much attention at the time, but yeah, it looks like Andres
tripped over some variant of this.

I'd be kind of inclined to remove this test script altogether, on the
grounds that it's wasting cycles on a function that doesn't really
do what is claimed (and we should remove the documentation claim, too).

Having said that, it's still true that this test has been stable in
the buildfarm.  Andres explained what he had to perturb to make it
fail, so I'm still interested in what Horiguchi-san did to break it.

regards, tom lane




Re: EXPLAIN/EXPLAIN ANALYZE REFRESH MATERIALIZED VIEW

2021-03-04 Thread Japin Li


On Thu, 04 Mar 2021 at 14:53, Bharath Rupireddy 
 wrote:
> Thanks! I will look forward for more review comments.
>

v4-0001-Rearrange-Refresh-Mat-View-Code.patch
-

+static Oid
+get_new_heap_oid(RefreshMatViewStmt *stmt, Relation matviewRel, Oid matviewOid,
+char *relpersistence)
+{
+   Oid OIDNewHeap;
+   boolconcurrent;
+   Oid tableSpace;
+
+   concurrent = stmt->concurrent;
+
+   /* Concurrent refresh builds new data in temp tablespace, and does 
diff. */
+   if (concurrent)
+   {
+   tableSpace = GetDefaultTablespace(RELPERSISTENCE_TEMP, false);
+   *relpersistence = RELPERSISTENCE_TEMP;
+   }

Since concurrent is used in only one place, I think we can remove the local
variable concurrent in get_new_heap_oid().

The others look good to me.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




RE: [PATCH] pgbench: improve \sleep meta command

2021-03-04 Thread kuroda.hay...@fujitsu.com
Dear Miyake-san,

I agree your suggestions and I think this patch is basically good.
I put some comments:

* When the following line is input, the error message is misleading.
  I think the output should be " \sleep command argument must be an integer...".

\sleep foo
-> pgbench: fatal: test.sql:5: unrecognized time unit, must be us, ms or s 
(foo) in command "sleep"
   \sleep foo
  ^ error found here

  I'm not sure but I think this is caused because `my_command->argv[2]` becomes 
"foo".

* A blank is missing in some lines, for example:

> + if (my_command->argc ==2)

  A blank should be added between a variable and an operator.


Could you fix them?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED





Re: About to add WAL write/fsync statistics to pg_stat_wal view

2021-03-04 Thread Fujii Masao



On 2021/03/05 8:38, Masahiro Ikeda wrote:

On 2021-03-05 01:02, Fujii Masao wrote:

On 2021/03/04 16:14, Masahiro Ikeda wrote:

On 2021-03-03 20:27, Masahiro Ikeda wrote:

On 2021-03-03 16:30, Fujii Masao wrote:

On 2021/03/03 14:33, Masahiro Ikeda wrote:

On 2021-02-24 16:14, Fujii Masao wrote:

On 2021/02/15 11:59, Masahiro Ikeda wrote:

On 2021-02-10 00:51, David G. Johnston wrote:

On Thu, Feb 4, 2021 at 4:45 PM Masahiro Ikeda
 wrote:


I pgindented the patches.


... XLogWrite, which is invoked during an
XLogFlush request (see ...).  This is also
incremented by the WAL receiver during replication.

("which normally called" should be "which is normally called" or
"which normally is called" if you want to keep true to the original)
You missed the adding the space before an opening parenthesis here and
elsewhere (probably copy-paste)

is ether -> is either
"This parameter is off by default as it will repeatedly query the
operating system..."
", because" -> "as"


Thanks, I fixed them.


wal_write_time and the sync items also need the note: "This is also
incremented by the WAL receiver during replication."


I skipped changing it since I separated the stats for the WAL receiver
in pg_stat_wal_receiver.


"The number of times it happened..." -> " (the tally of this event is
reported in wal_buffers_full in) This is undesirable because ..."


Thanks, I fixed it.


I notice that the patch for WAL receiver doesn't require explicitly
computing the sync statistics but does require computing the write
statistics.  This is because of the presence of issue_xlog_fsync but
absence of an equivalent pg_xlog_pwrite.  Additionally, I observe that
the XLogWrite code path calls pgstat_report_wait_*() while the WAL
receiver path does not.  It seems technically straight-forward to
refactor here to avoid the almost-duplicated logic in the two places,
though I suspect there may be a trade-off for not adding another
function call to the stack given the importance of WAL processing
(though that seems marginalized compared to the cost of actually
writing the WAL).  Or, as Fujii noted, go the other way and don't have
any shared code between the two but instead implement the WAL receiver
one to use pg_stat_wal_receiver instead.  In either case, this
half-and-half implementation seems undesirable.


OK, as Fujii-san mentioned, I separated the WAL receiver stats.
(v10-0002-Makes-the-wal-receiver-report-WAL-statistics.patch)


Thanks for updating the patches!



I added the infrastructure code to communicate the WAL receiver stats messages 
between the WAL receiver and the stats collector, and
the stats for WAL receiver is counted in pg_stat_wal_receiver.
What do you think?


On second thought, this idea seems not good, because those stats are
collected across multiple walreceivers, while the other values in
pg_stat_wal_receiver are only related to the walreceiver process running
at that moment. IOW, it seems strange that some values show dynamic
stats and the others show collected stats, even though they are in
the same view pg_stat_wal_receiver. Thought?


OK, I fixed it.
The stats collected in the WAL receiver are exposed in the pg_stat_wal view
in the v11 patch.


Thanks for updating the patches! I'm now reading 001 patch.

+    /* Check whether the WAL file was synced to disk right now */
+    if (enableFsync &&
+    (sync_method == SYNC_METHOD_FSYNC ||
+ sync_method == SYNC_METHOD_FSYNC_WRITETHROUGH ||
+ sync_method == SYNC_METHOD_FDATASYNC))
+    {

Isn't it better to make issue_xlog_fsync() return immediately
if enableFsync is off or sync_method is open_sync or open_datasync,
to simplify the code more?


Thanks for the comments.
I added the above code in v12 patch.



+    /*
+ * Send WAL statistics only if WalWriterDelay has elapsed to minimize
+ * the overhead in WAL-writing.
+ */
+    if (rc & WL_TIMEOUT)
+    pgstat_send_wal();

On second thought, this change means that it always takes wal_writer_delay
before the walwriter's WAL stats are sent after XLogBackgroundFlush() is called.
For example, if wal_writer_delay is set to several seconds, some values in
pg_stat_wal would needlessly be out of date for those seconds.
So I'm thinking to withdraw my previous comment; it's OK to send
the stats every time XLogBackgroundFlush() is called. Thought?


Thanks, I didn't notice that.

Although PGSTAT_STAT_INTERVAL is 500msec, wal_writer_delay's
default value is 200msec, and it may be set to an even shorter time.


Yeah, if wal_writer_delay is set to very small value, there is a risk
that the WAL stats are sent too frequently. I agree that's a problem.



Why don't we use another way to check the timestamp?

+   /*
+    * Don't send a message unless it's been at least
PGSTAT_STAT_INTERVAL
+    * msec since we last sent one
+    */
+   now = GetCurrentTimestamp();
+   if (TimestampDifferenceExceeds(last_report, now,
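The throttling pattern quoted above (report only if at least PGSTAT_STAT_INTERVAL has elapsed since the last report) can be sketched outside PostgreSQL as follows; the names are illustrative:

```python
import time

class StatsReporter:
    """Send at most one report per min_interval_ms, like PGSTAT_STAT_INTERVAL."""
    def __init__(self, min_interval_ms=500, clock=None):
        self.min_interval_ms = min_interval_ms
        # Injectable clock (milliseconds) so the behavior is testable.
        self.clock = clock or (lambda: time.monotonic() * 1000.0)
        self.last_report = float("-inf")

    def maybe_send(self):
        now = self.clock()
        if now - self.last_report < self.min_interval_ms:
            return False            # too soon: skip this report
        self.last_report = now
        return True                 # caller actually sends the stats

t = [0.0]
r = StatsReporter(min_interval_ms=500, clock=lambda: t[0])
print(r.maybe_send())   # True: the first report always goes out
t[0] = 200.0
print(r.maybe_send())   # False: only 200 ms elapsed
t[0] = 700.0
print(r.maybe_send())   # True: 700 ms since the last sent report
```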

Re: 011_crash_recovery.pl intermittently fails

2021-03-04 Thread Peter Geoghegan
On Thu, Mar 4, 2021 at 7:32 PM Tom Lane  wrote:
> Hmmm ... what environment is that?  This test script hasn't changed
> meaningfully in several years, and we have not seen any real issues
> with it up to now.

Did you see this recent thread?

https://postgr.es/m/20210208215206.mqmrnpkaqrdtm...@alap3.anarazel.de

-- 
Peter Geoghegan




Re: 011_crash_recovery.pl intermittently fails

2021-03-04 Thread Tom Lane
Kyotaro Horiguchi  writes:
> I noticed that 011_crash_recovery.pl intermittently (that is, one
> out of three or so in my environment) fails in the second test.

Hmmm ... what environment is that?  This test script hasn't changed
meaningfully in several years, and we have not seen any real issues
with it up to now.

> If the server crashed before emitting WAL records for the transaction
> just started, the restarted server cannot know the xid is even
> started.  I'm not sure that is the intention of the test but we must
> make sure the WAL is emitted before crashing.  CHECKPOINT ensures
> that.

The original commit for this test says


commit 857ee8e391ff6654ef9dcc5dd8b658d7709d0a3c
Author: Robert Haas 
Date:   Fri Mar 24 12:00:53 2017 -0400

Add a txid_status function.

If your connection to the database server is lost while a COMMIT is
in progress, it may be difficult to figure out whether the COMMIT was
successful or not.  This function will tell you, provided that you
don't wait too long to ask.  It may be useful in other situations,
too.

Craig Ringer, reviewed by Simon Riggs and by me

Discussion: 
http://postgr.es/m/camsr+yhqiwnei0dactbos40t+v5s_+dst3pyv_8v2wnvh+x...@mail.gmail.com


If the function needs a CHECKPOINT to give a reliable answer,
is it actually good for the claimed purpose?

Independently of that, I doubt that adding a checkpoint call
after the pg_current_xact_id() call is going to help.  The
Perl script is able to move on as soon as it's read the
function result.  If we need this hack, it has to be put
before that SELECT, AFAICS.

regards, tom lane




Re: [PATCH] remove deprecated v8.2 containment operators

2021-03-04 Thread Justin Pryzby
On Thu, Mar 04, 2021 at 08:58:39PM -0500, Tom Lane wrote:
> Justin Pryzby  writes:
> > [ 0001-remove-deprecated-v8.2-containment-operators.patch ]
> 
> I'm confused by why this patch is only dropping the operators'
> opclass-membership links.  Don't we want to actually DROP OPERATOR
> too?

Okay

Also, I think it's unrelated to this patch, but shouldn't these be removed?
See: b0b7be613, c15898c1d

+++ b/doc/src/sgml/brin.sgml

- Operator Strategy 7, 13, 16, 24, 25
+ Operator Strategy 7, 16, 24, 25

- Operator Strategy 8, 14, 26, 27
+ Operator Strategy 8, 26, 27


> Also, the patch seems to be trying to resurrect hstore--1.0--1.1.sql,

Not sure why or how I had that.

-- 
Justin
>From d9cfb33c1a87a8404fa949613500acd1021254cd Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 11 Apr 2020 22:57:06 -0500
Subject: [PATCH] remove deprecated v8.2 containment operators

See also:
ba920e1c9182eac55d5f1327ab0d29b721154277 684ad6a92fcc33adebdab65c4e7d72a68ba05408
3165426e54df02a6199c0a216546e5095e875a0a 2f70fdb0644c32c4154236c2b5c241bec92eac5e 591d282e8d3e0448ec1339c6b066e10953f040a2
---
 contrib/cube/Makefile   |  2 +-
 contrib/cube/cube--1.4--1.5.sql |  8 
 contrib/cube/cube.control   |  2 +-
 contrib/hstore/Makefile |  1 +
 contrib/hstore/hstore--1.8--1.9.sql |  7 +++
 contrib/hstore/hstore.control   |  2 +-
 contrib/intarray/Makefile   |  2 +-
 contrib/intarray/intarray--1.4--1.5.sql | 10 ++
 contrib/intarray/intarray.control   |  2 +-
 contrib/seg/Makefile|  2 +-
 contrib/seg/seg--1.3--1.4.sql   |  8 
 contrib/seg/seg.control |  2 +-
 doc/src/sgml/cube.sgml  |  8 
 doc/src/sgml/hstore.sgml| 10 --
 doc/src/sgml/intarray.sgml  |  8 
 doc/src/sgml/seg.sgml   |  8 
 16 files changed, 41 insertions(+), 41 deletions(-)
 create mode 100644 contrib/cube/cube--1.4--1.5.sql
 create mode 100644 contrib/hstore/hstore--1.8--1.9.sql
 create mode 100644 contrib/intarray/intarray--1.4--1.5.sql
 create mode 100644 contrib/seg/seg--1.3--1.4.sql

diff --git a/contrib/cube/Makefile b/contrib/cube/Makefile
index 54f609db17..cf195506c7 100644
--- a/contrib/cube/Makefile
+++ b/contrib/cube/Makefile
@@ -7,7 +7,7 @@ OBJS = \
 	cubeparse.o
 
 EXTENSION = cube
-DATA = cube--1.2.sql cube--1.2--1.3.sql cube--1.3--1.4.sql \
+DATA = cube--1.2.sql cube--1.2--1.3.sql cube--1.3--1.4.sql cube--1.4--1.5.sql \
 	cube--1.1--1.2.sql cube--1.0--1.1.sql
 PGFILEDESC = "cube - multidimensional cube data type"
 
diff --git a/contrib/cube/cube--1.4--1.5.sql b/contrib/cube/cube--1.4--1.5.sql
new file mode 100644
index 00..54492e5d18
--- /dev/null
+++ b/contrib/cube/cube--1.4--1.5.sql
@@ -0,0 +1,8 @@
+/* contrib/cube/cube--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION cube UPDATE TO '1.5'" to load this file. \quit
+
+-- Remove @ and ~
+DROP OPERATOR @ (cube, cube);
+DROP OPERATOR ~ (cube, cube);
diff --git a/contrib/cube/cube.control b/contrib/cube/cube.control
index 3e238fc937..50427ec117 100644
--- a/contrib/cube/cube.control
+++ b/contrib/cube/cube.control
@@ -1,6 +1,6 @@
 # cube extension
 comment = 'data type for multidimensional cubes'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/cube'
 relocatable = true
 trusted = true
diff --git a/contrib/hstore/Makefile b/contrib/hstore/Makefile
index c4e339b57c..97b228b65f 100644
--- a/contrib/hstore/Makefile
+++ b/contrib/hstore/Makefile
@@ -12,6 +12,7 @@ OBJS = \
 
 EXTENSION = hstore
 DATA = hstore--1.4.sql \
+	hstore--1.8--1.9.sql \
 	hstore--1.7--1.8.sql \
 	hstore--1.6--1.7.sql \
 	hstore--1.5--1.6.sql \
diff --git a/contrib/hstore/hstore--1.8--1.9.sql b/contrib/hstore/hstore--1.8--1.9.sql
new file mode 100644
index 00..7cd3467c55
--- /dev/null
+++ b/contrib/hstore/hstore--1.8--1.9.sql
@@ -0,0 +1,7 @@
+/* contrib/hstore/hstore--1.8--1.9.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION hstore UPDATE TO '1.9'" to load this file. \quit
+
+-- Remove @
+DROP OPERATOR @ (hstore, hstore);
diff --git a/contrib/hstore/hstore.control b/contrib/hstore/hstore.control
index 89e3c746c4..b73c28aa4d 100644
--- a/contrib/hstore/hstore.control
+++ b/contrib/hstore/hstore.control
@@ -1,6 +1,6 @@
 # hstore extension
 comment = 'data type for storing sets of (key, value) pairs'
-default_version = '1.8'
+default_version = '1.9'
 module_pathname = '$libdir/hstore'
 relocatable = true
 trusted = true
diff --git a/contrib/intarray/Makefile b/contrib/intarray/Makefile
index 01faa36b10..3817c1669a 100644
--- a/contrib/intarray/Makefile
+++ b/contrib/intarray/Makefile
@@ -12,7 +12,7 @@ OBJS = \
 	_intbig_gist.o
 
 EXTENSION = intarray
-DATA = intarray--1.3--1.4.sql intarray--1.2--1.3.sql 

Re: Track replica origin progress for Rollback Prepared

2021-03-04 Thread Amit Kapila
On Fri, Mar 5, 2021 at 7:11 AM Amit Kapila  wrote:
>
> On Thu, Mar 4, 2021 at 8:03 PM Amit Kapila  wrote:
> >
>
> I have just checked via code coverage that we don't seem to have tests
> for recovery of replication origin advance for commit [1], see
> function xact_redo_commit. Similarly, a similar code is not covered
> for prepare [2], see EndPrepare. I think overall the test cases for
> replication origins are not very many. Now, I think if we want to have
> more tests in this area then we need to look at it more broadly. I
> think it could be that currently only subscriber-side covers some part
> of origins testing, otherwise, we don't have a detailed test suite by
> using replication origin APIs and second is probably it might be
> tricky to write a reliable recovery test case. One idea could be to
> just write a test for the non-recovery code path (the code added in
> function RecordTransactionAbortPrepared) and leave recovery testing
> for now.
>

For non-recovery, I would have to write a test like the one below:

SELECT pg_replication_origin_create('regress');
SELECT pg_replication_origin_session_setup('regress');
Begin;
SELECT pg_replication_origin_xact_setup('0/aabbccdd', '2013-01-01 00:00');
Insert into t1 values(1);
Prepare Transaction 'foo';
SELECT pg_replication_origin_xact_setup('0/aabbccee', '2013-01-01 00:00');
Rollback Prepared 'foo';
SELECT pg_replication_origin_session_progress(true);

The thing which bothers me about such a test is calling
pg_replication_origin_xact_setup after 'Prepare Transaction'; it seems
like I am exploiting this API for this test. This API is not meant for
this purpose, so I am not sure whether I should go ahead with this test.
What do you think? If you can think of any other alternative, please let
me know. We don't have very good infrastructure to test origins and
their progress except for the subscriber-side code.

-- 
With Regards,
Amit Kapila.




Re: Corruption during WAL replay

2021-03-04 Thread Kyotaro Horiguchi
At Thu, 4 Mar 2021 22:37:23 +0500, Ibrar Ahmed  wrote in 
> The regression is failing for this patch, do you mind look at that and send
> the updated patch?
> 
> https://api.cirrus-ci.com/v1/task/6313174510075904/logs/test.log
> 
> ...
> t/006_logical_decoding.pl  ok
> t/007_sync_rep.pl  ok
> Bailout called.  Further testing stopped:  system pg_ctl failed
> FAILED--Further testing stopped: system pg_ctl failed
> make[2]: *** [Makefile:19: check] Error 255
> make[1]: *** [Makefile:49: check-recovery-recurse] Error 2
> make: *** [GNUmakefile:71: check-world-src/test-recurse] Error 2
> ...

(I regret that I sent this as .patch file..)

Thanks for pointing that out!

The patch assumed that the CHKPT_START/COMPLETE barriers are used
exclusively of each other, but MarkBufferDirtyHint, which delays
checkpoint start, is called in RelationTruncate while checkpoint
completion is being delayed.  That is neither strange nor harmful
behavior.  I changed delayChkpt from an enum to a bitmap integer so
that both barriers can be triggered separately.
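The enum-to-bitmask change lets the two delay reasons be held at the same time; a minimal sketch of the flag arithmetic (flag names mirror the patch, the scenario is illustrative):

```python
DELAY_CHKPT_START    = 1 << 0   # block checkpoint start
DELAY_CHKPT_COMPLETE = 1 << 1   # block checkpoint completion

delay_chkpt = 0

# RelationTruncate delays checkpoint completion ...
delay_chkpt |= DELAY_CHKPT_COMPLETE
# ... and MarkBufferDirtyHint, called underneath it, independently
# delays checkpoint start.  With a single boolean this would have
# tripped the old Assert(!MyProc->delayChkpt).
delay_chkpt |= DELAY_CHKPT_START

print(bool(delay_chkpt & DELAY_CHKPT_START))     # True
print(bool(delay_chkpt & DELAY_CHKPT_COMPLETE))  # True

# Each barrier is released without disturbing the other.
delay_chkpt &= ~DELAY_CHKPT_START
print(bool(delay_chkpt & DELAY_CHKPT_START))     # False
print(bool(delay_chkpt & DELAY_CHKPT_COMPLETE))  # True
```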

I'm not sure this is the way to go here, though.  This fixes the issue
of a crash during RelationTruncate, but the issue of smgrtruncate
failure during RelationTruncate still remains (unless we treat that
failure as PANIC?).

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/transam/multixact.c 
b/src/backend/access/transam/multixact.c
index 1f9f1a1fa1..c1b0b48362 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3072,8 +3072,8 @@ TruncateMultiXact(MultiXactId newOldestMulti, Oid 
newOldestMultiDB)
 * crash/basebackup, even though the state of the data directory would
 * require it.
 */
-   Assert(!MyProc->delayChkpt);
-   MyProc->delayChkpt = true;
+   Assert((MyProc->delayChkpt & DELAY_CHKPT_START) == 0);
+   MyProc->delayChkpt |= DELAY_CHKPT_START;
 
/* WAL log truncation */
WriteMTruncateXlogRec(newOldestMultiDB,
@@ -3099,7 +3099,7 @@ TruncateMultiXact(MultiXactId newOldestMulti, Oid 
newOldestMultiDB)
/* Then offsets */
PerformOffsetsTruncation(oldestMulti, newOldestMulti);
 
-   MyProc->delayChkpt = false;
+   MyProc->delayChkpt &= ~DELAY_CHKPT_START;
 
END_CRIT_SECTION();
LWLockRelease(MultiXactTruncationLock);
diff --git a/src/backend/access/transam/twophase.c 
b/src/backend/access/transam/twophase.c
index 80d2d20d6c..85c720491b 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -463,7 +463,7 @@ MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId 
xid, const char *gid,
proc->lxid = (LocalTransactionId) xid;
proc->xid = xid;
Assert(proc->xmin == InvalidTransactionId);
-   proc->delayChkpt = false;
+   proc->delayChkpt = 0;
proc->statusFlags = 0;
proc->pid = 0;
proc->backendId = InvalidBackendId;
@@ -1109,7 +1109,8 @@ EndPrepare(GlobalTransaction gxact)
 
START_CRIT_SECTION();
 
-   MyProc->delayChkpt = true;
+   Assert((MyProc->delayChkpt & DELAY_CHKPT_START) == 0);
+   MyProc->delayChkpt |= DELAY_CHKPT_START;
 
XLogBeginInsert();
for (record = records.head; record != NULL; record = record->next)
@@ -1152,7 +1153,7 @@ EndPrepare(GlobalTransaction gxact)
 * checkpoint starting after this will certainly see the gxact as a
 * candidate for fsyncing.
 */
-   MyProc->delayChkpt = false;
+   MyProc->delayChkpt &= ~DELAY_CHKPT_START;
 
/*
 * Remember that we have this GlobalTransaction entry locked for us.  If
@@ -2198,7 +2199,8 @@ RecordTransactionCommitPrepared(TransactionId xid,
START_CRIT_SECTION();
 
/* See notes in RecordTransactionCommit */
-   MyProc->delayChkpt = true;
+   Assert((MyProc->delayChkpt & DELAY_CHKPT_START) == 0);
+   MyProc->delayChkpt |= DELAY_CHKPT_START;
 
/*
 * Emit the XLOG commit record. Note that we mark 2PC commits as
@@ -2246,7 +2248,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
TransactionIdCommitTree(xid, nchildren, children);
 
/* Checkpoint can proceed now */
-   MyProc->delayChkpt = false;
+   MyProc->delayChkpt &= ~DELAY_CHKPT_START;
 
END_CRIT_SECTION();
 
diff --git a/src/backend/access/transam/xact.c 
b/src/backend/access/transam/xact.c
index 4e6a3df6b8..f033e8940a 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1334,8 +1334,9 @@ RecordTransactionCommit(void)
 * This makes checkpoint's determination of which xacts are 
delayChkpt
 * a bit fuzzy, but it doesn't matter.
 */
+   Assert((MyProc->delayChkpt & DELAY_CHKPT_START) == 0);
START_CRIT_SECTION();
-   MyProc->delayChkpt = true;
+   MyProc->delayChkpt |= DELAY_CHKPT_START;
 
 

011_crash_recovery.pl intermittently fails

2021-03-04 Thread Kyotaro Horiguchi
Hello.

I noticed that 011_crash_recovery.pl intermittently (that is, one out
of three or so in my environment) fails in the second test.

> t/011_crash_recovery.pl .. 2/3 
> #   Failed test 'new xid after restart is greater'
> #   at t/011_crash_recovery.pl line 56.
> # '539'
> # >
> # '539'
> 
> #   Failed test 'xid is aborted after crash'
> #   at t/011_crash_recovery.pl line 60.
> #  got: 'committed'
> # expected: 'aborted'
> # Looks like you failed 2 tests of 3.
> t/011_crash_recovery.pl .. Dubious, test returned 2 (wstat 512, 0x200)
> Failed 2/3 subtests 
> 
> Test Summary Report
> ---
> t/011_crash_recovery.pl (Wstat: 512 Tests: 3 Failed: 2)
>   Failed tests:  2-3
>   Non-zero exit status: 2
> Files=1, Tests=3,  3 wallclock secs ( 0.03 usr  0.01 sys +  1.90 cusr  0.39 
> csys =  2.33 CPU)
> Result: FAIL

If the server crashed before emitting WAL records for the transaction
just started, the restarted server cannot know the xid was even
started.  I'm not sure that is the intention of the test, but we must
make sure the WAL is emitted before crashing.  CHECKPOINT ensures
that.

Thoughts?  The attached seems to stabilize the test for me.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/test/recovery/t/011_crash_recovery.pl b/src/test/recovery/t/011_crash_recovery.pl
index 10cd98f70a..66d7b8c871 100644
--- a/src/test/recovery/t/011_crash_recovery.pl
+++ b/src/test/recovery/t/011_crash_recovery.pl
@@ -35,6 +35,7 @@ $stdin .= q[
 BEGIN;
 CREATE TABLE mine(x integer);
 SELECT pg_current_xact_id();
+CHECKPOINT;
 ];
 $tx->pump until $stdout =~ /[[:digit:]]+[\r\n]$/;
 


Re: [PATCH] pgbench: Bug fix for the -d option

2021-03-04 Thread miyake_kouta

2021-03-04 21:11, Michael Paquier wrote:

Like any other src/bin/ facilities, let's instead
remove all the getenv() calls at the beginning of pgbench.c and let's
libpq handle those environment variables if the parameters are NULL
(aka in the case of no values given directly in the options).  This
requires to move the debug log after doConnect(), which is no big deal
anyway as a failure results in an exit(1) immediately with a log
telling where the connection failed.


Thank you for improving my patch.
I agree that we should remove getenv() from pgbench.c and let libpq 
complement parameters with environment variables.
As you said, it's not a big problem that the debug log output comes
after doConnect(), I think.

Your patch is simpler and more ideal.

Regards.
--
Kota Miyake




Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)

2021-03-04 Thread Andy Fan
On Fri, Mar 5, 2021 at 12:00 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:

> > On Thu, Feb 18, 2021 at 08:58:13PM +0800, Andy Fan wrote:
>
> Thanks for continuing work on this patch!
>
> > On Tue, Feb 16, 2021 at 12:01 PM David Rowley 
> wrote:
> >
> > > On Fri, 12 Feb 2021 at 15:18, Andy Fan 
> wrote:
> > > >
> > > > On Fri, Feb 12, 2021 at 9:02 AM David Rowley 
> > > wrote:
> > > >> The reason I don't really like this is that it really depends where
> > > >> you want to use RelOptInfo.notnullattrs.  If someone wants to use it
> > > >> to optimise something before the base quals are evaluated then they
> > > >> might be unhappy that they found some NULLs.
> > > >>
> > > >
> > > > Do you mean the notnullattrs is not set correctly before the base
> quals
> > > are
> > > > evaluated?  I think we have lots of data structures which are set
> just
> > > after some
> > > > stage.  but notnullattrs is special because it is set at more than 1
> > > stage.  However
> > > > I'm doubtful it is unacceptable, Some fields ever change their
> meaning
> > > at different
> > > > stages like Var->varno.  If a user has a misunderstanding on it, it
> > > probably will find it
> > > > at the testing stage.
> > >
> > > You're maybe focusing too much on your use case for notnullattrs. It
> > > only cares about NULLs in the result for each query level.
> > >
> > >  thinks of an example...
> > >
> > > OK, let's say I decided that COUNT(*) is faster than COUNT(id) so
> > > decided that I might like to write a patch which rewrite the query to
> > > use COUNT(*) when it was certain that "id" could not contain NULLs.
> > >
> > > The query is:
> > >
> > > SELECT p.partid, p.partdesc,COUNT(s.saleid) FROM part p LEFT OUTER
> > > JOIN sales s ON p.partid = s.partid GROUP BY p.partid;
> > >
> > > sale.saleid is marked as NOT NULL in pg_attribute.  As the writer of
> > > the patch, I checked the comment for notnullattrs and it says "Not
> > > null attrs, start from -FirstLowInvalidHeapAttributeNumber", so I
> > > should be ok to assume since sales.saleid is marked in notnullattrs
> > > that I can rewrite the query?!
> > >
> > > The documentation about the RelOptInfo.notnullattrs needs to be clear
> > > what exactly it means. I'm not saying your representation of how to
> > > record NOT NULL in incorrect. I'm saying that you need to be clear
> > > what exactly is being recorded in that field.
> > >
> > > If you want it to mean "attribute marked here cannot output NULL
> > > values at this query level", then you should say something along those
> > > lines.
> > >
> > > However, having said that, because this is a Bitmapset of
> > > pg_attribute.attnums, it's only possible to record Vars from base
> > > relations.  It does not seem like you have any means to record
> > > attributes that are normally NULLable, but cannot produce NULL values
> > > due to a strict join qual.
> > >
> > > e.g: SELECT t.nullable FROM t INNER JOIN j ON t.nullable = j.something;
> > >
> > > I'd expect the RelOptInfo for t not to contain a bit for the
> > > "nullable" column, but there's no way to record the fact that the join
> > > RelOptInfo for {t,j} cannot produce a NULL for that column. It might
> > > be quite useful to know that for the UniqueKeys patch.
> > >
> >
> > I checked again and found I did miss the check on JoinExpr->quals.  I have
> > fixed it in the v3 patch. Thanks for the review!
> >
> > In the attached v3,  commit 1 is the real patch, and commit 2 is just add
> > some logs to help local testing.  notnull.sql/notnull.out is the test
> case
> > for
> > this patch, both commit 2 and notnull.* are not intended to be committed
> > at last.
>
> Just to clarify, this version of notnullattrs here is the latest one,
> and another one from "UniqueKey on Partitioned table" thread should be
> disregarded?
>

Actually they are different parts of the UniqueKey work.  Since I don't want
to mix two topics in one place, I opened another thread.  The topic here is
how to represent a not-null attribute, which is a precondition for all the
UniqueKey stuff.  The thread "UniqueKey on Partitioned table" [1] is about
how to maintain the UniqueKey on a partitioned table only.


>
> > Besides the above fix in v3, I changed the comments along the notnullattrs
> > as below and added a true positive helper function is_var_nullable.
>
> With "true positive" you mean it will always correctly say if a Var is
> nullable or not?


Not null.  If it says a Var is not null (return value is false), it is not
null for sure.  If it says nullable (true), the Var may still be not null at
some stages, but I don't want to distinguish those cases too finely, so I
just say it is nullable.


> I'm not sure about this, but couldn't be there still
> some cases when a Var belongs to nullable_baserels, but still has some
> constraints preventing it from being nullable (e.g. a silly example when
> the not nullable column belong to the table, and the query does full
> join of this table on itself using this 

Re: [PATCH] remove deprecated v8.2 containment operators

2021-03-04 Thread Tom Lane
Justin Pryzby  writes:
> [ 0001-remove-deprecated-v8.2-containment-operators.patch ]

I'm confused by why this patch is only dropping the operators'
opclass-membership links.  Don't we want to actually DROP OPERATOR
too?

Also, the patch seems to be trying to resurrect hstore--1.0--1.1.sql,
which I do not see the point of.  It was removed because no modern
server will even think it's valid syntax, and that situation has
not changed.

regards, tom lane




RE: Refactor ECPGconnect and allow IPv6 connection

2021-03-04 Thread kuroda.hay...@fujitsu.com
Dear Hackers,

I reviewed for myself and fixed something:

* refactor parse_options(), same as conninfo_uri_parse_params() in libpq
  Skipping blanks is needed in this function because the ecpg precompiler adds
  additional blanks between option parameters. I did not fix the precompiler
  for compatibility reasons; if that changes, maybe SO_MAJOR_VERSION will also
  have to be changed.
* update doc

parse_new/oldstyle() is not changed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v4-0001-refactor_ecpgconnect.patch
Description: v4-0001-refactor_ecpgconnect.patch


v4-0002-allow-IPv6.patch
Description: v4-0002-allow-IPv6.patch


Re: make coverage-html would fail within build directory separate from source tree

2021-03-04 Thread walker
Thanks for your reminder, Tom.


Before I understood VPATH well, I always thought "outside the source tree"
meant the build tree is not under the source tree.


Here, because of the VPATH build, the build tree should be a subdirectory of
the source tree. Following this rule, I retried this scenario with an old
version of lcov (1.10):
tar -zxf source_dir.tar.gz
cd source_dir
mkdir build_dir && cd build_dir
../configure --enable-coverage
make
make check
make coverage-html


And "make coverage-html" works fine, with no errors or warnings; the output
is like this:
/bin/lcov --gcov-tool /bin/gcov -q --no-external -c -i -d . -d 
source_dir/build_dir/../ -o lcov_base.info
/bin/lcov --gcov-tool /bin/gcov -q --no-external -c -d . -d 
source_dir/build_dir/../ -o lcov_test.info
rm -rf coverage
/bin/genhtml -q --legend -o coverage --title='PostgreSQL 13.2' --num-spaces=4 
lcov_base.info lcov_test.info
touch coverage-html-stamp


thanks
walker






[PATCH] pgbench: improve \sleep meta command

2021-03-04 Thread miyake_kouta

Hi.

I created a patch to improve \sleep meta command in pgbench.

There are two problems with the current pgbench implementation of 
\sleep.
First, when the input is like "\sleep foo s", the string is converted to 0
by the atoi function, and no error is raised.
Second, when the input is like "\sleep :some_bool s" and some_bool is set to
true or false, the boolean is treated as 0 as well.
I think we should catch these errors, so I suggest my patch to
detect them and raise an error.


Regards.
--
Kota Miyakediff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 31a4df45f5..3f28b022d7 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2860,7 +2860,18 @@ evaluateSleep(CState *st, int argc, char **argv, int *usecs)
 			pg_log_error("%s: undefined variable \"%s\"", argv[0], argv[1] + 1);
 			return false;
 		}
+
+		for (int i = 0; i < strlen(var); i++)
+		{
+			if (!isdigit((unsigned char) var[i]))
+			{
+				pg_log_error("%s: invalid sleep time, must be an integer",
+							 argv[0]);
+				return false;
+			}
+		}
+
 		usec = atoi(var);
 	}
 	else
 		usec = atoi(argv[1]);
@@ -4647,4 +4658,4 @@ process_backslash_command(PsqlScanState sstate, const char *source)
-		if (my_command->argc == 2 && my_command->argv[1][0] != ':')
+		if (my_command->argv[1][0] != ':')
 		{
 			char	   *c = my_command->argv[1];
 
@@ -4655,9 +4666,18 @@ process_backslash_command(PsqlScanState sstate, const char *source)
 				c++;
 			if (*c)
 			{
-				my_command->argv[2] = c;
-				offsets[2] = offsets[1] + (c - my_command->argv[1]);
-				my_command->argc = 3;
+				if (my_command->argc == 2)
+				{
+					my_command->argv[2] = c;
+					offsets[2] = offsets[1] + (c - my_command->argv[1]);
+					my_command->argc = 3;
+				}
+				else
+				{
+					syntax_error(source, lineno, my_command->first_line, my_command->argv[0],
+								 "\\sleep command argument must be an integer",
+								 my_command->argv[1], offsets[1] - start_offset);
+				}
 			}
 		}
 


Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-04 Thread Amit Kapila
On Fri, Mar 5, 2021 at 5:07 AM Greg Nancarrow  wrote:
>
> On Thu, Mar 4, 2021 at 11:05 PM Amit Kapila  wrote:
> >
> > On Thu, Mar 4, 2021 at 5:24 PM Greg Nancarrow  wrote:
> > >
> > > On Thu, Mar 4, 2021 at 10:07 PM Amit Kapila  
> > > wrote:
> > > >
> > >
> > > >Also, in
> > > > standard_planner, we should add these partitionOids only for
> > > > parallel-mode.
> > > >
> > >
> > > It is doing that in v20 patch (what makes you think it isn't?).
> > >
> >
> > The below code snippet:
> > + /* For AcquireExecutorLocks(). */
> > + if (IsModifySupportedInParallelMode(parse->commandType))
> > + result->partitionOids = glob->partitionOids;
> >
> > I understand that you have a check for the parallel mode in
> > AcquireExecutorLocks but why can't we have it before adding that to
> > planned statement
> >
>
> OK, I think I'm on the same wavelength now (sorry, I didn't realise
> you're talking about PlannedStmt).
>
> What I believe you're suggesting is that in the planner, where
> partitionOids are "added" to the returned PlannedStmt, they should
> only be added if glob->parallelModeNeeded is true:
>
> i.e.
>
> /* For AcquireExecutorLocks(). */
> if (glob->partitionOids != NIL && glob->parallelModeNeeded)
> result->partitionOids = glob->partitionOids;
>
> (seems reasonable to me, as then it will match the condition for which
> glob->partitionOids are added to glob->relationOids)
>
> then in plancache.c the check on parallelModeNeeded can be removed:
>
> /* Lock partitions ahead of modifying them in parallel mode. */
> if (rti == resultRelation &&
> plannedstmt->partitionOids != NIL)
> AcquireExecutorLocksOnPartitions(plannedstmt->partitionOids,
>  rte->rellockmode, acquire);
>
> Let me know if this matches what you were thinking.
>

Yes, that is what I was thinking. But I have another question as well
regarding tracking these partitions in two places (once in
plannedstmt->relationOids and a second time in
plannedstmt->partitionOids). I think it is better if you can prepare a
patch addressing all the comments raised till now, leaving this last
question aside.

-- 
With Regards,
Amit Kapila.




Re: Track replica origin progress for Rollback Prepared

2021-03-04 Thread Amit Kapila
On Thu, Mar 4, 2021 at 8:03 PM Amit Kapila  wrote:
>
> Having said that, I think we use
> replication origins to test it. For example:
>
> Create Table t1(c1 int);
>
> SELECT pg_replication_origin_create('regress');
> SELECT pg_replication_origin_session_setup('regress');
> SELECT pg_replication_origin_xact_setup('0/aabbccdd', '2013-01-01 00:00');
> Begin;
> Select txid_current();
> Insert into t1 values(1);
> Prepare Transaction 'foo';
> SELECT pg_replication_origin_xact_setup('0/aabbccee', '2013-01-01 00:00');
> Rollback Prepared 'foo';
> SELECT pg_replication_origin_session_progress(true);
>
> -- Restart the server
> SELECT pg_replication_origin_session_setup('regress');
> SELECT pg_replication_origin_session_progress(true);
>
> The value before the patch will be aabbccdd and after the patch, it
> will be aabbccee.
>
> I think here we have a slight timing thing which is if the checkpoint
> happens before restart then the problem won't occur, not sure but
> maybe increase the checkpoint_timeout as well to make it reproducible.
> I am of opinion that this area won't change much and anyway once
> subscriber-side 2PC feature gets committed, we will have few tests
> covering this area but I am fine if you still insist to have something
> like above and think that we don't need to bother much about timing
> stuff.
>

I have just checked via code coverage that we don't seem to have tests
for recovery of replication origin advance for commit [1], see
function xact_redo_commit. Similar code is not covered
for prepare [2], see EndPrepare. Overall, the test cases for
replication origins are not very many. If we want to have
more tests in this area then we need to look at it more broadly. It
could be that currently only the subscriber side covers some part
of origin testing; beyond that, we don't have a detailed test suite
exercising the replication origin APIs, and it would probably be
tricky to write a reliable recovery test case. One idea could be to
just write a test for the non-recovery code path (the code added in
function RecordTransactionAbortPrepared) and leave recovery testing
for now.

[1] - 
https://coverage.postgresql.org/src/backend/access/transam/xact.c.gcov.html
[2] - 
https://coverage.postgresql.org/src/backend/access/transam/twophase.c.gcov.html

-- 
With Regards,
Amit Kapila.




Re: Extensibility of the PostgreSQL wire protocol

2021-03-04 Thread Jan Wieck

On 3/4/21 7:38 PM, Hannu Krosing wrote:

On Thu, Mar 4, 2021 at 9:55 PM Jan Wieck  wrote:

but the whole thing was developed that way from the beginning and
it is working. I don't have a definitive date when that code will be
presented. Kuntal or Prateek may be able to fill in more details.


Are you really fully replacing the main loop, or are you running a second
main loop in parallel in the same database server instance, perhaps as
a separate TDS_postmaster backend ?

Will the data still also be accessible "as postgres" via port 5432 when
TDS/SQLServer support is active ?


The individual backend (session) is running a different main loop. A 
libpq based client will still get the regular libpq and the original 
PostgresMain() behavior on port 5432. The default port for TDS is 1433 
and with everything in place I can connect to the same database on that 
port with Microsoft's SQLCMD.


The whole point of all this is to allow the postmaster to listen on more 
than just 5432 and to speak different communication protocols on those 
*additional* ports. Nothing is really *replaced*. The parts of the 
backend that do actual socket communication are just routed through 
function pointers so that an extension can change their behavior.



Regards, Jan

--
Jan Wieck
Principle Database Engineer
Amazon Web Services




Re: Add support for PROVE_FLAGS and PROVE_TESTS in MSVC scripts

2021-03-04 Thread Michael Paquier
On Thu, Mar 04, 2021 at 06:37:56PM +0100, Juan José Santamaría Flecha wrote:
> LGTM.

Thanks.  I have tested more combinations of options, came back a bit
to the documentation for buildenv.pl where copying the values from the
docs would result in a warning, and applied it.
--
Michael




Re: PoC/WIP: Extended statistics on expressions

2021-03-04 Thread Tomas Vondra
On 3/5/21 1:43 AM, Justin Pryzby wrote:
> On Mon, Jan 04, 2021 at 09:45:24AM -0600, Justin Pryzby wrote:
>> On Mon, Jan 04, 2021 at 03:34:08PM +, Dean Rasheed wrote:
>>> * I'm not sure I understand the need for 0001. Wasn't there an earlier
>>> version of this patch that just did it by re-populating the type
>>> array, but which still had it as an array rather than turning it into
>>> a list? Making it a list falsifies some of the comments and
>>> function/variable name choices in that file.
>>
>> This part is from me.
>>
>> I can review the names if it's desired , but it'd be fine to fall back to the
>> earlier patch.  I thought a pglist was cleaner, but it's not needed.
> 
> This updates the preliminary patches to address the issues Dean raised.
> 
> One advantage of using a pglist is that we can free it by calling
> list_free_deep(Typ), rather than looping to free each of its elements.
> But maybe for bootstrap.c it doesn't matter, and we can just write:
> | Typ = NULL; /* Leak the old Typ array */
> 

Thanks. I'll switch this in the next version of the patch series.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: PoC/WIP: Extended statistics on expressions

2021-03-04 Thread Justin Pryzby
On Mon, Jan 04, 2021 at 09:45:24AM -0600, Justin Pryzby wrote:
> On Mon, Jan 04, 2021 at 03:34:08PM +, Dean Rasheed wrote:
> > * I'm not sure I understand the need for 0001. Wasn't there an earlier
> > version of this patch that just did it by re-populating the type
> > array, but which still had it as an array rather than turning it into
> > a list? Making it a list falsifies some of the comments and
> > function/variable name choices in that file.
> 
> This part is from me.
> 
> I can review the names if it's desired , but it'd be fine to fall back to the
> earlier patch.  I thought a pglist was cleaner, but it's not needed.

This updates the preliminary patches to address the issues Dean raised.

One advantage of using a pglist is that we can free it by calling
list_free_deep(Typ), rather than looping to free each of its elements.
But maybe for bootstrap.c it doesn't matter, and we can just write:
| Typ = NULL; /* Leak the old Typ array */

-- 
Justin
>From 41ec12096cefc00484e7f2a0b3bfbc0f79cdd162 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 19 Nov 2020 20:48:48 -0600
Subject: [PATCH 1/2] bootstrap: convert Typ to a List*

---
 src/backend/bootstrap/bootstrap.c | 89 ++-
 1 file changed, 41 insertions(+), 48 deletions(-)

diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 6f615e6622..1b940d9d27 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -58,7 +58,7 @@ static void BootstrapModeMain(void);
 static void bootstrap_signals(void);
 static void ShutdownAuxiliaryProcess(int code, Datum arg);
 static Form_pg_attribute AllocateAttribute(void);
-static void populate_typ_array(void);
+static void populate_typ(void);
 static Oid	gettype(char *type);
 static void cleanup(void);
 
@@ -159,7 +159,7 @@ struct typmap
 	FormData_pg_type am_typ;
 };
 
-static struct typmap **Typ = NULL;
+static List *Typ = NIL; /* List of struct typmap* */
 static struct typmap *Ap = NULL;
 
 static Datum values[MAXATTR];	/* current row's attribute values */
@@ -595,10 +595,10 @@ boot_openrel(char *relname)
 
 	/*
 	 * pg_type must be filled before any OPEN command is executed, hence we
-	 * can now populate the Typ array if we haven't yet.
+	 * can now populate Typ if we haven't yet.
 	 */
-	if (Typ == NULL)
-		populate_typ_array();
+	if (Typ == NIL)
+		populate_typ();
 
 	if (boot_reldesc != NULL)
 		closerel(NULL);
@@ -688,7 +688,7 @@ DefineAttr(char *name, char *type, int attnum, int nullness)
 
 	typeoid = gettype(type);
 
-	if (Typ != NULL)
+	if (Typ != NIL)
 	{
 		attrtypes[attnum]->atttypid = Ap->am_oid;
 		attrtypes[attnum]->attlen = Ap->am_typ.typlen;
@@ -866,47 +866,36 @@ cleanup(void)
 }
 
 /* ----------------
- *		populate_typ_array
+ *		populate_typ
  *
- * Load the Typ array by reading pg_type.
+ * Load Typ by reading pg_type.
  * ----------------
  */
 static void
-populate_typ_array(void)
+populate_typ(void)
 {
 	Relation	rel;
 	TableScanDesc scan;
 	HeapTuple	tup;
-	int			nalloc;
-	int			i;
-
-	Assert(Typ == NULL);
+	MemoryContext old;
 
-	nalloc = 512;
-	Typ = (struct typmap **)
-		MemoryContextAlloc(TopMemoryContext, nalloc * sizeof(struct typmap *));
+	Assert(Typ == NIL);
 
 	rel = table_open(TypeRelationId, NoLock);
 	scan = table_beginscan_catalog(rel, 0, NULL);
-	i = 0;
+	old = MemoryContextSwitchTo(TopMemoryContext);
 	while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
 	{
 		Form_pg_type typForm = (Form_pg_type) GETSTRUCT(tup);
+		struct typmap *newtyp;
 
-		/* make sure there will be room for a trailing NULL pointer */
-		if (i >= nalloc - 1)
-		{
-			nalloc *= 2;
-			Typ = (struct typmap **)
-repalloc(Typ, nalloc * sizeof(struct typmap *));
-		}
-		Typ[i] = (struct typmap *)
-			MemoryContextAlloc(TopMemoryContext, sizeof(struct typmap));
-		Typ[i]->am_oid = typForm->oid;
-		memcpy(&(Typ[i]->am_typ), typForm, sizeof(Typ[i]->am_typ));
-		i++;
+		newtyp = (struct typmap *) palloc(sizeof(struct typmap));
+		Typ = lappend(Typ, newtyp);
+
+		newtyp->am_oid = typForm->oid;
+		memcpy(&newtyp->am_typ, typForm, sizeof(newtyp->am_typ));
 	}
-	Typ[i] = NULL;				/* Fill trailing NULL pointer */
+	MemoryContextSwitchTo(old);
 	table_endscan(scan);
 	table_close(rel, NoLock);
 }
@@ -916,25 +905,26 @@ populate_typ_array(void)
  *
  * NB: this is really ugly; it will return an integer index into TypInfo[],
  * and not an OID at all, until the first reference to a type not known in
- * TypInfo[].  At that point it will read and cache pg_type in the Typ array,
+ * TypInfo[].  At that point it will read and cache pg_type in Typ,
  * and subsequently return a real OID (and set the global pointer Ap to
  * point at the found row in Typ).  So caller must check whether Typ is
- * still NULL to determine what the return value is!
+ * still NIL to determine what the return value is!
  * ----------------
  */
 static Oid
 gettype(char *type)
 {
-	if (Typ != NULL)
+	if (Typ != NIL)
 	{
-		struct typmap 

Re: Extensibility of the PostgreSQL wire protocol

2021-03-04 Thread Hannu Krosing
On Thu, Mar 4, 2021 at 9:55 PM Jan Wieck  wrote:
>
> Another possibility, and this is what is being used by the AWS team
> implementing the TDS protocol for Babelfish, is to completely replace
> the entire TCOP mainloop function PostgresMain().

I suspect this is the only reasonable way to do it for protocols which are
not very close to libpq.

> That is of course a
> rather drastic move and requires a lot more coding on the extension
> side,

Not necessarily - if the new protocol is close to the existing one, then it
is copy/paste + some changes.

If it is radically different, then trying to fit it into the current
mainloop will be even harder than writing from scratch.

And will very likely fail in the end anyway :)

> but the whole thing was developed that way from the beginning and
> it is working. I don't have a definitive date when that code will be
> presented. Kuntal or Prateek may be able to fill in more details.

Are you really fully replacing the main loop, or are you running a second
main loop in parallel in the same database server instance, perhaps as
a separate TDS_postmaster backend ?

Will the data still also be accessible "as postgres" via port 5432 when
TDS/SQLServer support is active ?




Re: macOS SIP, next try

2021-03-04 Thread Tom Lane
Peter Eisentraut  writes:
> On 01.03.21 15:44, Tom Lane wrote:
>> Peter Eisentraut  writes:
>>> I have since learned that there is a way to disable only the part of SIP
>>> that is relevant for us.  This seems like a useful compromise, and it
>>> appears that a number of other open-source projects are following the
>>> same route.  I suggest the attached documentation patch and then close
>>> this issue.

>> Hmm, interesting.  Where is it documented what this does?

> Not really documented AFAICT, but here is a source: 
> https://developer.apple.com/forums/thread/17452

Hmm.  So I tried this, ie "csrutil enable --without debug" in the
recovery system, and after rebooting what I see is

$ csrutil status
System Integrity Protection status: unknown (Custom Configuration).

Configuration:
Apple Internal: disabled
Kext Signing: enabled
Filesystem Protections: disabled
Debugging Restrictions: enabled
DTrace Restrictions: enabled
NVRAM Protections: enabled
BaseSystem Verification: enabled

This is an unsupported configuration, likely to break in the future and leave 
your machine in an unknown state.
$ 

which is, shall we say, not the set of options the command appeared
to select.  It does work, in the sense that "make check" is able
to complete without having an installation tree.  But really, Apple
is doing their level best to hang a "here be dragons" sign on this.
I'm not comfortable with recommending it, and I'm about to go
turn it off again, because I have no damn idea what it really does.

regards, tom lane




Re: PROXY protocol support

2021-03-04 Thread Álvaro Hernández



On 5/3/21 0:21, Jacob Champion wrote:
> On Thu, 2021-03-04 at 21:45 +0100, Magnus Hagander wrote:
>> On Thu, Mar 4, 2021 at 9:07 PM Jacob Champion  wrote:
>>> Idle thought I had while setting up a local test rig: Are there any
>>> compelling cases for allowing PROXY packets to arrive over Unix
>>> sockets? (By which I mean, the proxy is running on the same machine as
>>> Postgres, and connects to it using the .s.PGSQL socket file instead of
>>> TCP.) Are there cases where you want some other software to interact
>>> with the TCP stack instead of Postgres, but it'd still be nice to have
>>> the original connection information available?
>> I'm uncertain what that usecase would be for something like haproxy,
>> tbh. It can't do connection pooling, so adding it on the same machine
>> as postgres itself wouldn't really add anything, I think?
> Yeah, I wasn't thinking HAproxy so much as some unspecified software
> appliance that's performing Some Task before allowing a TCP client to
> speak to Postgres. But it'd be better to hear from someone that has an
> actual use case, instead of me spitballing.

    Here's a use case: Envoy's Postgres filter (see [1], [2]). Right now
it is able to capture protocol-level metrics and send them to a metrics
collector (e.g. Prometheus) while proxying the traffic. More capabilities
are being added as of today, and it will eventually manage HBA too. It
would greatly benefit from this proposal, since it proxies the traffic
with, obviously, its own IP, not the client's. It may be used locally
fronting Postgres via UDS (so it can easily be trusted), which is what we do.


    Álvaro


[1]
https://www.envoyproxy.io/docs/envoy/latest/configuration/listeners/network_filters/postgres_proxy_filter
[2]
https://www.cncf.io/blog/2020/08/13/envoy-1-15-introduces-a-new-postgres-extension-with-monitoring-support/

-- 

Alvaro Hernandez


---
OnGres






Re: WIP: BRIN multi-range indexes

2021-03-04 Thread Tomas Vondra
On 3/4/21 5:30 PM, John Naylor wrote:
> 
> On Tue, Mar 2, 2021 at 7:32 PM Tomas Vondra
> mailto:tomas.von...@enterprisedb.com>>
> wrote:
>> Ummm, in brin_minmax_multi_distance_timetz()? I don't think timetz can
>> be infinite, no? I think brin_minmax_multi_distance_date you meant?
> 
> In timestamp.c there are plenty of checks for TIMESTAMP_NOT_FINITE
> for TimestampTz types and I assumed that was applicable here.
> 

I don't think so. The NOT_FINITE macros are defined only for timestamps
and dates, not for TimeTzADT. So I think the current code is correct.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: PROXY protocol support

2021-03-04 Thread Hannu Krosing
The current proposal seems to miss the case of transaction pooling
(and statement pooling) where the same established connection
multiplexes transactions / statements from multiple remote clients.

What we would need for that case would be a function

pg_set_remote_client_address(be_key, remote_ip, remote_hostname)

where only be_key and remote_ip are required, but any string (up to a
certain length) would be accepted as the hostname.

It would be really nice if we could send this request at the protocol level,
but if that is hard to do then having a function would get us halfway there.

The be_key in the function is the key from PGcancel, which is stored
by libpq when making the connection; it is there to make sure
that only the directly connecting proxy can successfully call the function.

Cheers
Hannu

On Fri, Mar 5, 2021 at 12:21 AM Jacob Champion  wrote:
>
> On Thu, 2021-03-04 at 21:45 +0100, Magnus Hagander wrote:
> > On Thu, Mar 4, 2021 at 9:07 PM Jacob Champion  wrote:
> > > Idle thought I had while setting up a local test rig: Are there any
> > > compelling cases for allowing PROXY packets to arrive over Unix
> > > sockets? (By which I mean, the proxy is running on the same machine as
> > > Postgres, and connects to it using the .s.PGSQL socket file instead of
> > > TCP.) Are there cases where you want some other software to interact
> > > with the TCP stack instead of Postgres, but it'd still be nice to have
> > > the original connection information available?
> >
> > I'm uncertain what that usecase would be for something like haproxy,
> > tbh. It can't do connection pooling, so adding it on the same machine
> > as postgres itself wouldn't really add anything, I think?
>
> Yeah, I wasn't thinking HAproxy so much as some unspecified software
> appliance that's performing Some Task before allowing a TCP client to
> speak to Postgres. But it'd be better to hear from someone that has an
> actual use case, instead of me spitballing.
>
> > I did think about the other end, if you had a proxy on a different
> > machine accepting unix connections and passing them on over
> > PROXY-over-tcp. But I doubt it's useful to know it was unix in that
> > case  (since it still couldn't do peer or such for the auth) --
> > instead, that seems like an argument where it'd be better to proxy
> > without using PROXY and just letting the IP address be.
>
> You could potentially design a system that lets you proxy a "local all
> all trust" setup from a different (trusted) machine, without having to
> actually let people onto the machine that's running Postgres. That
> would require some additional authentication on the PROXY connection
> (i.e. something stronger than host-based auth) to actually be useful.
>
> -- other notes --
>
> A small nitpick on the current separate-port PoC is that I'm forced to
> set up a "regular" TCP port, even if I only want the PROXY behavior.
>
> The original-host logging isn't working for me:
>
>   WARNING:  pg_getnameinfo_all() failed: ai_family not supported
>   LOG:  proxy connection from: host=??? port=???
>
> and I think the culprit is this:
>
>		/* Store a copy of the original address, for logging */
>		memcpy(&raddr_save, &port->raddr, port->raddr.salen);
>
> port->raddr.salen is the length of port->raddr.addr; we want the length
> of the copy to be sizeof(port->raddr) here, no?
>
> --Jacob




Re: PITR promote bug: Checkpointer writes to older timeline

2021-03-04 Thread Soumyadeep Chakraborty
Hey all,

I took a stab at a quick and dirty TAP test (my first ever). So it
can probably be improved a lot. Please take a look.

On Thu, Mar 04, 2021 at 10:28:31AM +0900, Kyotaro Horiguchi wrote:

> 2. Restore ThisTimeLineID after calling XLogReadRecord() in the
>   *caller* side.  This is what came up to me first but I don't like
>   this, too, but I don't find better fix.  way.

+1 to this patch [1].
The above TAP test passes with this patch applied.

[1] 
https://www.postgresql.org/message-id/attachment/119972/dont_change_thistimelineid.patch

Regards,
Soumyadeep

Regards,
Soumyadeep
diff --git a/src/test/recovery/t/022_pitr_prepared_xact.pl b/src/test/recovery/t/022_pitr_prepared_xact.pl
new file mode 100644
index 000..b8d5146bb9d
--- /dev/null
+++ b/src/test/recovery/t/022_pitr_prepared_xact.pl
@@ -0,0 +1,69 @@
+# Test that the checkpointer does not write to the older timeline during PITR promotion
+use strict;
+use warnings;
+use File::Path qw(rmtree);
+use PostgresNode;
+use TestLib;
+use Test::More tests => 2;
+use File::Compare;
+use Time::HiRes qw(usleep);
+
+$ENV{PGDATABASE} = 'postgres';
+
+# Initialize primary node
+my $node_primary = get_new_node('primary');
+$node_primary->init(has_archiving => 1);
+$node_primary->append_conf('postgresql.conf', "max_prepared_transactions = 10");
+$node_primary->append_conf('postgresql.conf', "max_wal_senders = 10");
+$node_primary->append_conf('postgresql.conf', "wal_level = 'replica'");
+$node_primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+my $node_pitr = get_new_node('node_pitr');
+$node_pitr->init_from_backup($node_primary, $backup_name,
+standby => 0, has_restoring => 1);
+$node_pitr->append_conf('postgresql.conf', "max_prepared_transactions = 10");
+$node_pitr->append_conf('postgresql.conf', "recovery_target_name = 'rp'");
+$node_pitr->append_conf('postgresql.conf', "recovery_target_action = 'promote'");
+
+# Workload with prepared transaction
+$node_primary->psql(
+'postgres', qq{
+CREATE TABLE foo(i int);
+BEGIN;
+INSERT INTO foo VALUES(1);
+PREPARE TRANSACTION 'fooinsert';
+SELECT pg_create_restore_point('rp');
+INSERT INTO foo VALUES(2);
+SELECT pg_switch_wal();
+});
+
+# Sleep 5s to ensure that the WAL has been archived.
+# This can probably be replaced by a proper wait.
+usleep(5_000_000);
+
+$node_pitr->start;
+
+my $last_archived_wal_file_name = $node_primary->safe_psql('postgres',
+"SELECT last_archived_wal FROM pg_stat_archiver;");
+
+# Ensure that we don't write to the older timeline during PITR promotion by
+# ensuring that the last archived WAL file was not overwritten during recovery.
+my $archive_dir = $node_primary->archive_dir;
+my $archive_wal_file_path = "${archive_dir}/${last_archived_wal_file_name}";
+my $node_pitr_data = $node_pitr->data_dir;
+my $local_wal_file_path = "${node_pitr_data}/pg_wal/${last_archived_wal_file_name}";
+is(compare($archive_wal_file_path, $local_wal_file_path), qq(0), "Check if the last archived WAL file was overwritten");
+
+$node_pitr->psql(
+'postgres', qq{
+COMMIT PREPARED 'fooinsert';
+});
+
+my $foo_select_result = $node_pitr->safe_psql('postgres',
+"SELECT * FROM foo;");
+
+is($foo_select_result, qq(1), "Check foo select result");


Re: About to add WAL write/fsync statistics to pg_stat_wal view

2021-03-04 Thread Masahiro Ikeda

On 2021-03-05 01:02, Fujii Masao wrote:

On 2021/03/04 16:14, Masahiro Ikeda wrote:

On 2021-03-03 20:27, Masahiro Ikeda wrote:

On 2021-03-03 16:30, Fujii Masao wrote:

On 2021/03/03 14:33, Masahiro Ikeda wrote:

On 2021-02-24 16:14, Fujii Masao wrote:

On 2021/02/15 11:59, Masahiro Ikeda wrote:

On 2021-02-10 00:51, David G. Johnston wrote:

On Thu, Feb 4, 2021 at 4:45 PM Masahiro Ikeda
 wrote:


I pgindented the patches.


... XLogWrite, which is invoked during an
XLogFlush request (see ...).  This is also
incremented by the WAL receiver during replication.

("which normally called" should be "which is normally called" or
"which normally is called" if you want to keep true to the 
original)
You missed the adding the space before an opening parenthesis 
here and

elsewhere (probably copy-paste)

is ether -> is either
"This parameter is off by default as it will repeatedly query 
the

operating system..."
", because" -> "as"


Thanks, I fixed them.

wal_write_time and the sync items also need the note: "This is 
also

incremented by the WAL receiver during replication."


I skipped changing it since I separated the stats for the WAL 
receiver

in pg_stat_wal_receiver.

"The number of times it happened..." -> " (the tally of this 
event is
reported in wal_buffers_full in) This is undesirable because 
..."


Thanks, I fixed it.

I notice that the patch for WAL receiver doesn't require 
explicitly
computing the sync statistics but does require computing the 
write
statistics.  This is because of the presence of issue_xlog_fsync 
but
absence of an equivalent pg_xlog_pwrite.  Additionally, I 
observe that
the XLogWrite code path calls pgstat_report_wait_*() while the 
WAL
receiver path does not.  It seems technically straight-forward 
to
refactor here to avoid the almost-duplicated logic in the two 
places,

though I suspect there may be a trade-off for not adding another
function call to the stack given the importance of WAL 
processing

(though that seems marginalized compared to the cost of actually
writing the WAL).  Or, as Fujii noted, go the other way and 
don't have
any shared code between the two but instead implement the WAL 
receiver

one to use pg_stat_wal_receiver instead.  In either case, this
half-and-half implementation seems undesirable.


OK, as Fujii-san mentioned, I separated the WAL receiver stats.
(v10-0002-Makes-the-wal-receiver-report-WAL-statistics.patch)


Thanks for updating the patches!


I added the infrastructure code to communicate the WAL receiver 
stats messages between the WAL receiver and the stats collector, 
and

the stats for WAL receiver is counted in pg_stat_wal_receiver.
What do you think?


On second thought, this idea seems not good. Because those stats 
are

collected between multiple walreceivers, but other values in
pg_stat_wal_receiver is only related to the walreceiver process 
running
at that moment. IOW, it seems strange that some values show 
dynamic

stats and the others show collected stats, even though they are in
the same view pg_stat_wal_receiver. Thought?


OK, I fixed it.
The stats collected in the WAL receiver is exposed in pg_stat_wal 
view in v11 patch.


Thanks for updating the patches! I'm now reading 001 patch.

+    /* Check whether the WAL file was synced to disk right now */
+    if (enableFsync &&
+    (sync_method == SYNC_METHOD_FSYNC ||
+ sync_method == SYNC_METHOD_FSYNC_WRITETHROUGH ||
+ sync_method == SYNC_METHOD_FDATASYNC))
+    {

Isn't it better to make issue_xlog_fsync() return immediately
if enableFsync is off, sync_method is open_sync or open_data_sync,
to simplify the code more?


Thanks for the comments.
I added the above code in v12 patch.



+    /*
+ * Send WAL statistics only if WalWriterDelay has elapsed 
to minimize

+ * the overhead in WAL-writing.
+ */
+    if (rc & WL_TIMEOUT)
+    pgstat_send_wal();

On second thought, this change means that it always takes 
wal_writer_delay
before walwriter's WAL stats is sent after XLogBackgroundFlush() is 
called.
For example, if wal_writer_delay is set to several seconds, some 
values in

pg_stat_wal would be not up-to-date meaninglessly for those seconds.
So I'm thinking to withdraw my previous comment; it's ok to send
the stats every time XLogBackgroundFlush() is called. Thought?


Thanks, I didn't notice that.

Although PGSTAT_STAT_INTERVAL is 500msec, wal_writer_delay's
default value is 200msec and it may be set shorter time.


Yeah, if wal_writer_delay is set to very small value, there is a risk
that the WAL stats are sent too frequently. I agree that's a problem.



Why don't to make another way to check the timestamp?

+   /*
+    * Don't send a message unless it's been at least
PGSTAT_STAT_INTERVAL
+    * msec since we last sent one
+    */
+   now = GetCurrentTimestamp();
+   if (TimestampDifferenceExceeds(last_report, now,

Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-04 Thread Greg Nancarrow
On Thu, Mar 4, 2021 at 11:05 PM Amit Kapila  wrote:
>
> On Thu, Mar 4, 2021 at 5:24 PM Greg Nancarrow  wrote:
> >
> > On Thu, Mar 4, 2021 at 10:07 PM Amit Kapila  wrote:
> > >
> >
> > >Also, in
> > > standard_planner, we should add these partitionOids only for
> > > parallel-mode.
> > >
> >
> > It is doing that in v20 patch (what makes you think it isn't?).
> >
>
> The below code snippet:
> + /* For AcquireExecutorLocks(). */
> + if (IsModifySupportedInParallelMode(parse->commandType))
> + result->partitionOids = glob->partitionOids;
>
> I understand that you have a check for the parallel mode in
> AcquireExecutorLocks but why can't we have it before adding that to
> planned statement
>

OK, I think I'm on the same wavelength now (sorry, I didn't realise
you're talking about PlannedStmt).

What I believe you're suggesting is that in the planner, where
partitionOids are "added" to the returned PlannedStmt, they should
only be added if glob->parallelModeNeeded is true, i.e.:

/* For AcquireExecutorLocks(). */
if (glob->partitionOids != NIL && glob->parallelModeNeeded)
result->partitionOids = glob->partitionOids;

(seems reasonable to me, as then it will match the condition for which
glob->partitionOids are added to glob->relationOids)

then in plancache.c the check on parallelModeNeeded can be removed:

/* Lock partitions ahead of modifying them in parallel mode. */
if (rti == resultRelation &&
plannedstmt->partitionOids != NIL)
AcquireExecutorLocksOnPartitions(plannedstmt->partitionOids,
 rte->rellockmode, acquire);

Let me know if this matches what you were thinking.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: PROXY protocol support

2021-03-04 Thread Jacob Champion
On Thu, 2021-03-04 at 21:45 +0100, Magnus Hagander wrote:
> On Thu, Mar 4, 2021 at 9:07 PM Jacob Champion  wrote:
> > Idle thought I had while setting up a local test rig: Are there any
> > compelling cases for allowing PROXY packets to arrive over Unix
> > sockets? (By which I mean, the proxy is running on the same machine as
> > Postgres, and connects to it using the .s.PGSQL socket file instead of
> > TCP.) Are there cases where you want some other software to interact
> > with the TCP stack instead of Postgres, but it'd still be nice to have
> > the original connection information available?
> 
> I'm uncertain what that usecase would be for something like haproxy,
> tbh. It can't do connection pooling, so adding it on the same machine
> as postgres itself wouldn't really add anything, I think?

Yeah, I wasn't thinking HAproxy so much as some unspecified software
appliance that's performing Some Task before allowing a TCP client to
speak to Postgres. But it'd be better to hear from someone that has an
actual use case, instead of me spitballing.

> I did think about the other end, if you had a proxy on a different
> machine accepting unix connections and passing them on over
> PROXY-over-tcp. But I doubt it's useful to know it was unix in that
> case  (since it still couldn't do peer or such for the auth) --
> instead, that seems like an argument where it'd be better to proxy
> without using PROXY and just letting the IP address be.

You could potentially design a system that lets you proxy a "local all
all trust" setup from a different (trusted) machine, without having to
actually let people onto the machine that's running Postgres. That
would require some additional authentication on the PROXY connection
(i.e. something stronger than host-based auth) to actually be useful.

-- other notes --

A small nitpick on the current separate-port PoC is that I'm forced to
set up a "regular" TCP port, even if I only want the PROXY behavior.

The original-host logging isn't working for me:

  WARNING:  pg_getnameinfo_all() failed: ai_family not supported
  LOG:  proxy connection from: host=??? port=???

and I think the culprit is this:

>/* Store a copy of the original address, for logging */
>memcpy(&raddr_save, &port->raddr, port->raddr.salen);

port->raddr.salen is the length of port->raddr.addr; we want the length
of the copy to be sizeof(port->raddr) here, no?
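To illustrate the pitfall, here is a minimal self-contained analogue; FakeSockAddr is a hypothetical stand-in for the patch's SockAddr shape (an address buffer followed by the used length), not the real struct:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the SockAddr struct in the patch:
 * a fixed-size address buffer plus the length actually used. */
typedef struct
{
    unsigned char addr[16];
    size_t      salen;          /* length of the data in addr[] */
} FakeSockAddr;

/* Copying only src->salen bytes copies a prefix of addr[] but never
 * reaches the salen field itself, so the copy's salen is whatever
 * the destination held before (here: zero). */
static void
copy_short(FakeSockAddr *dst, const FakeSockAddr *src)
{
    memcpy(dst, src, src->salen);       /* the buggy pattern */
}

/* Copying sizeof(*src) bytes preserves the whole struct. */
static void
copy_full(FakeSockAddr *dst, const FakeSockAddr *src)
{
    memcpy(dst, src, sizeof(*src));     /* the fix */
}
```

A later pg_getnameinfo_all() on the short copy would then see a zero salen, which matches the "ai_family not supported" symptom above.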

--Jacob


Re: Replace buffer I/O locks with condition variables (reviving an old patch)

2021-03-04 Thread Thomas Munro
On Fri, Feb 26, 2021 at 7:08 PM Thomas Munro  wrote:
> Back in 2016, Robert Haas proposed to replace I/O locks with condition
> variables[1].  Condition variables went in and have found lots of
> uses, but this patch to replace a bunch of LWLocks and some busy
> looping did not.  Since then, it has been tested quite a lot as part
> of the AIO project[2], which currently depends on it.  That's why I'm
> interested in following up now.  I asked Robert if he planned to
> re-propose it and he said I should go for it, so... here I go.

I removed a redundant (Size) cast, fixed the wait event name and
category (WAIT_EVENT_BUFFILE_XXX is for buffile.c stuff, not bufmgr.c
stuff, and this is really an IPC wait, not an IO wait despite the
name), updated documentation and pgindented.
From cb4f3c943c47bb09864723c22cc0504c54dc9a3a Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Fri, 26 Feb 2021 18:01:28 +1300
Subject: [PATCH v2] Replace buffer I/O locks with condition variables.

1.  Backends waiting for buffer I/O are now interruptible.

2.  If something goes wrong in a backend that is currently performing
I/O, other backends no longer wake up until that backend reaches
AbortBufferIO() and broadcasts on the CV.  Previously, any waiters would
wake up (because the I/O lock was automatically released) and then
busy-loop until AbortBufferIO() cleared BM_IO_IN_PROGRESS.

Author: Robert Haas 
Reviewed-by: Tom Lane  (2016)
Reviewed-by: Thomas Munro 
Discussion: https://postgr.es/m/CA%2BhUKGJ8nBFrjLuCTuqKN0pd2PQOwj9b_jnsiGFFMDvUxahj_A%40mail.gmail.com
Discussion: https://postgr.es/m/CA+Tgmoaj2aPti0yho7FeEf2qt-JgQPRWb0gci_o1Hfr=c56...@mail.gmail.com
---
 doc/src/sgml/monitoring.sgml |  8 +--
 src/backend/postmaster/pgstat.c  |  3 ++
 src/backend/storage/buffer/buf_init.c| 19 ---
 src/backend/storage/buffer/bufmgr.c  | 64 +++-
 src/backend/storage/lmgr/lwlock.c|  2 -
 src/include/pgstat.h |  1 +
 src/include/storage/buf_internals.h  |  7 +--
 src/include/storage/condition_variable.h | 11 
 src/include/storage/lwlock.h |  1 -
 src/tools/pgindent/typedefs.list |  1 +
 10 files changed, 51 insertions(+), 66 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3513e127b7..b37d6484a4 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1581,6 +1581,10 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
   Waiting for the page number needed to continue a parallel B-tree
scan to become available.
  
+ 
+  BufferWaitIO
+  Waiting for buffer I/O to complete.
+ 
  
   CheckpointDone
   Waiting for a checkpoint to complete.
@@ -1871,10 +1875,6 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
   BufferContent
   Waiting to access a data page in memory.
  
- 
-  BufferIO
-  Waiting for I/O on a data page.
- 
  
   BufferMapping
   Waiting to associate a data block with a buffer in the buffer
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index f75b52719d..26cd2ce196 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -4011,6 +4011,9 @@ pgstat_get_wait_ipc(WaitEventIPC w)
 		case WAIT_EVENT_BTREE_PAGE:
 			event_name = "BtreePage";
 			break;
+		case WAIT_EVENT_BUFFER_WAIT_IO:
+			event_name = "BufferWaitIO";
+			break;
 		case WAIT_EVENT_CHECKPOINT_DONE:
 			event_name = "CheckpointDone";
 			break;
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index e9e4f35bb5..51b250fe16 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -19,7 +19,7 @@
 
 BufferDescPadded *BufferDescriptors;
 char	   *BufferBlocks;
-LWLockMinimallyPadded *BufferIOLWLockArray = NULL;
+ConditionVariableMinimallyPadded *BufferIOCVArray = NULL;
 WritebackContext BackendWritebackContext;
 CkptSortItem *CkptBufferIds;
 
@@ -68,7 +68,7 @@ InitBufferPool(void)
 {
 	bool		foundBufs,
 foundDescs,
-foundIOLocks,
+foundIOCV,
 foundBufCkpt;
 
 	/* Align descriptors to a cacheline boundary. */
@@ -82,10 +82,10 @@ InitBufferPool(void)
		NBuffers * (Size) BLCKSZ, &foundBufs);
 
 	/* Align lwlocks to cacheline boundary */
-	BufferIOLWLockArray = (LWLockMinimallyPadded *)
-		ShmemInitStruct("Buffer IO Locks",
-		NBuffers * (Size) sizeof(LWLockMinimallyPadded),
-		&foundIOLocks);
+	BufferIOCVArray = (ConditionVariableMinimallyPadded *)
+		ShmemInitStruct("Buffer IO Condition Variables",
+		NBuffers * sizeof(ConditionVariableMinimallyPadded),
+		&foundIOCV);
 
 	/*
 	 * The array used to sort to-be-checkpointed buffer ids is located in
@@ -98,10 +98,10 @@ InitBufferPool(void)
 		ShmemInitStruct("Checkpoint BufferIds",
		NBuffers * sizeof(CkptSortItem), &foundBufCkpt);
 
-	if (foundDescs || foundBufs || foundIOLocks 

Re: PROXY protocol support

2021-03-04 Thread Tatsuo Ishii
>> On Thu, 2021-03-04 at 10:42 +0900, Tatsuo Ishii wrote:
>> > Is there any formal specification for the "a protocol common and very
>> > light weight in proxies"?
>>
>> See
>>
>> https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
>
> Yeah, it's currently in one of the comments, but should probably be
> added to the docs side as well.

It seems the protocol is HAProxy-specific, and I think it would be
better mentioned in the docs.

> And yes tests :) Probably not a regression test, but some level of tap
> testing should definitely be added. We'll just have to find a way to
> do that without making haproxy a dependency to run the tests :)

Agreed.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: Wired if-statement in gen_partprune_steps_internal

2021-03-04 Thread Ryan Lambert
On Wed, Mar 3, 2021 at 11:03 PM Amit Langote 
wrote:

> Sorry, this seems to have totally slipped my mind.
>
> Attached is the patch I had promised.
>
> Also, I have updated the title of the CF entry to "Some cosmetic
> improvements of partition pruning code", which I think is more
> appropriate.
>
> --
> Amit Langote
> EDB: http://www.enterprisedb.com


Thank you.  The updated patch passes installcheck-world.  I ran a handful
of test queries with a small number of partitions and observed the same
plans before and after the patch. I cannot speak to the quality of the
code, though am happy to test any additional use cases that should be
verified.


Ryan Lambert


Re: libpq debug log

2021-03-04 Thread Tom Lane
"tsunakawa.ta...@fujitsu.com"  writes:
> From: Kyotaro Horiguchi 
>> Using (inCursor - inStart) as logCursor doesn't work correctly if tracing 
>> state
>> desyncs.  Once desync happens inStart can be moved at the timing that the
>> tracing code doesn't expect. This requires (as I mentioned upthread)
>> pqReadData to actively reset logCursor, though.
>> 
>> logCursor should move when bytes are fed to the tracing functions even when
>> they are the last bytes of a message.

> Hmm, sharp and nice insight.  I'd like Iwata-san and Kirk to consider this, 
> including Horiguchi-san's previous mails.

I took a quick look through the v22 patch, and TBH I don't like much of
anything at all about the proposed architecture.  It's retained most
of the flavor of the way it was done before, which was a hangover from
the old process-on-the-fly scheme.

I think the right way to do this, now that we've killed off v2 protocol
support, is to rely on the fact that libpq fully buffers both incoming
and outgoing messages.  We should nuke every last bit of the existing
code that intersperses tracing logic with normal message decoding and
construction, and instead have two tracing entry points:

(1) Log a received message.  This is called as soon as we know (from
the length word) that we have the whole message available, and it's
just passed a pointer into the input buffer.  It should examine the
input message for itself and print the details.

(2) Log a message-to-be-sent.  Again, call it as soon as we've constructed
a complete message in the output buffer, and let it examine and print
that message for itself.

Both of these pieces of logic could be written directly from the protocol
specification, without any interaction with the main libpq code paths,
which would be a Good Thing IMO.  The current intertwined approach is
somewhere between useless and counterproductive if you're in need of
tracing down a libpq bug.  (In other words, I'm suggesting that the
apparent need for duplicate logic would be good not bad, and indeed that
it'd be best to write the tracing logic without consulting the existing
libpq code at all.)

This would, I think, also eliminate the need for extra storage to hold
info about bits of the message, which IMO is a pretty significant
downside of the proposed design.  The printing logic would just print
directly to Pfdebug; it wouldn't have to accumulate anything, and so
it wouldn't have the out-of-memory failure modes that this patch has.
You could also get rid of messy stuff like pqTraceForciblyCloseBeMsg.

Lastly, this'd reduce the overhead the tracing functionality imposes
on normal usage to about nil.  Admittedly, all those "if (Pfdebug)"
tests probably don't cost that much execution-wise, but they cost
something.  Making only one such test per sent or received message
has to be better.
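To sketch entry point (1): a routine handed a pointer to one complete v3-protocol message (1-byte type, then a 4-byte big-endian length word that counts itself) can format a trace line with no access to any other libpq state. The function name and output format below are illustrative only, not actual libpq code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical "log one received message" entry point: msg points
 * at a whole v3-protocol message already sitting in the input
 * buffer.  Decode the framing and print it; nothing here touches
 * the normal message-decoding paths. */
static int
trace_message(char *out, size_t outsz, const unsigned char *msg)
{
    /* 4-byte big-endian length, which includes the length word
     * itself but not the type byte */
    uint32_t    len = ((uint32_t) msg[1] << 24) | (msg[2] << 16) |
                      (msg[3] << 8) | msg[4];

    return snprintf(out, outsz, "<= %c, length %u, %u payload bytes",
                    msg[0], len, len - 4);
}
```

The outgoing direction (entry point (2)) would be the mirror image, called once the output buffer holds a complete constructed message.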

regards, tom lane




Re: Disallow SSL compression?

2021-03-04 Thread Daniel Gustafsson
> On 4 Mar 2021, at 11:59, Michael Paquier  wrote:
> 
> On Wed, Mar 03, 2021 at 03:14:01PM +0100, Peter Eisentraut wrote:
>> Per your other thread, you should also remove the environment variable.
>> 
>> In postgres_fdw, I think commenting it out is not the right change.  The
>> other commented out values are still valid settings but are omitted from the
>> test for other reasons.  It's not entirely all clear, but we don't have to
>> keep obsolete stuff in there forever.
> 
> Agreed on both points.  Also, it seems a bit weird to keep around
> pg_stat_ssl.compression while we know that it will always be false.
> Wouldn't it be better to get rid of that in PgBackendSSLStatus and
> pg_stat_ssl then?

Fixed in the v4 posted just now, although I opted for keeping the column in
pg_stat_ssl for backwards compatibility with a doc note.

--
Daniel Gustafsson   https://vmware.com/





Re: SPI return

2021-03-04 Thread Tom Lane
Patrick Handja  writes:
> I am not able to get the array returned by get_tuples function, and I am
> thinking it's SPI_finish(). When I tried to print my array tuples items
> after SPI_finish(), it does not work.

Indeed, SPI_finish() frees everything that was allocated by SPI functions,
including result tuple tables.  You need to do whatever it is you want
to do with the tuples before calling SPI_finish.
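The lifetime issue can be illustrated with a toy region allocator standing in for SPI's memory management; this is not the real SPI API, and in actual code the copy-out step would use SPI_palloc(), which allocates in the caller's upper executor context so the copy survives SPI_finish():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy region allocator: everything allocated through it is freed
 * at region_finish(), just as SPI_finish() frees all SPI-allocated
 * result data (including SPI_getvalue() strings). */
static char *region[16];
static int  region_n = 0;

static char *
dup_cstr(const char *s)
{
    char *p = malloc(strlen(s) + 1);
    strcpy(p, s);
    return p;
}

static char *
region_strdup(const char *s)    /* like a value from SPI_getvalue() */
{
    return region[region_n++] = dup_cstr(s);
}

static void
region_finish(void)             /* like SPI_finish() */
{
    while (region_n > 0)
        free(region[--region_n]);
}
```

The rule is simply: copy the data into memory you own before "finish" runs; pointers into the region are dangling afterwards.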

regards, tom lane




Re: Disallow SSL compression?

2021-03-04 Thread Daniel Gustafsson
> On 3 Mar 2021, at 15:14, Peter Eisentraut  
> wrote:
> 
> On 03.03.21 11:31, Daniel Gustafsson wrote:
>>> On 26 Feb 2021, at 20:34, Daniel Gustafsson  wrote:
>>> Attached is a v2 which retains the sslcompression parameter for backwards
>>> compatibility.
>> And now a v3 which fixes an oversight in postgres_fdw as well as adds an SSL
>> TAP test to cover deprecated parameters.
> 
> Per your other thread, you should also remove the environment variable.

Fixed.

> In postgres_fdw, I think commenting it out is not the right change.  The 
> other commented out values are still valid settings but are omitted from the 
> test for other reasons.  It's not entirely all clear, but we don't have to 
> keep obsolete stuff in there forever.

Ah, I didn't get that distinction but that makes sense. Fixed.

The attached version takes a step further and removes sslcompression from
pg_conn and just eats the value as there is no use in setting a dummy value.  It
also removes compression from PgBackendSSLStatus and be_tls_get_compression as
raised by Michael downthread.  I opted for keeping the column in pg_stat_ssl
with a note in the documentation that it will be removed, for the same
backwards compatibility reason of eating the connection param without acting on
it.  This might be overthinking it however.

--
Daniel Gustafsson   https://vmware.com/



v4-0001-Disallow-SSL-compression.patch
Description: Binary data


SPI return

2021-03-04 Thread Patrick Handja
Greetings,

I have spent a couple of days working on a separate library in PostgreSQL, and I
am facing some issues with returning data using SPI (Server Programming
Interface).
I have this simple get_tuples function, used in the main function foobar. My
example is very simple too (using a previously created table "a" with a
single column and some data):
SELECT foobar('select * from a');
I am not able to get the array returned by the get_tuples function, and I am
thinking it's SPI_finish(). When I tried to print my array tuples items
after SPI_finish(), it does not work.


## My Code ##

static char**
get_tuples(char *command) {
int ret;
uint64 rows;
char **tuples;

SPI_connect();
ret = SPI_exec(command, 0);
rows = SPI_processed;

tuples = palloc(sizeof(char*)*rows);

if (ret > 0 && SPI_tuptable != NULL) {
TupleDesc tupdesc = SPI_tuptable->tupdesc;
SPITupleTable *tuptable = SPI_tuptable;
uint64 j;

for (j = 0; j < rows; j++){
HeapTuple tuple = tuptable->vals[j];
int i;
for (i = 1; i <= tupdesc->natts; i++){
char *rowid;
rowid = SPI_getvalue(tuple, tupdesc, i);
/* no separate palloc needed: SPI_getvalue already allocated rowid */
tuples[j] = rowid;
}
}
}

// Printing my array to verify if I have all tuples, in fact I have all
of the
for (int i = 0; i < rows; ++i) {
   elog(INFO, "Item: %s", *(tuples + i));
}

pfree(command);
SPI_finish();
return tuples;
}

Datum
foobar (PG_FUNCTION_ARGS) {
char *command;
command = text_to_cstring(PG_GETARG_TEXT_PP(0));
get_tuples(command);

// When I tried to do something like this, I lose the connection
// with the server (error):
//elog(INFO, "*: %s", *(get_tuples(command) + 1));
PG_RETURN_INT64(1);
}

CREATE FUNCTION foobar(text) RETURNS int8
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE STRICT;


regards,

*Andjasubu Bungama, Patrick *


Re: pg_amcheck contrib application

2021-03-04 Thread Mark Dilger



> On Mar 4, 2021, at 2:04 PM, Peter Geoghegan  wrote:
> 
> On Thu, Mar 4, 2021 at 7:29 AM Robert Haas  wrote:
>> I think this whole approach is pretty suspect because the number of
>> blocks in the relation can increase (by relation extension) or
>> decrease (by VACUUM or TRUNCATE) between the time when we query for
>> the list of target relations and the time we get around to executing
>> any queries against them. I think it's OK to use the number of
>> relation pages for progress reporting because progress reporting is
>> only approximate anyway, but I wouldn't print them out in the progress
>> messages, and I wouldn't try to fix up the startblock and endblock
>> arguments on the basis of how long you think that relation is going to
>> be.
> 
> I don't think that the struct AmcheckOptions block fields (e.g.,
> startblock) should be of type 'long' -- that doesn't work well on
> Windows, where 'long' is only 32-bit. To be fair we already do the
> same thing elsewhere, but there is no reason to repeat those mistakes.
> (I'm rather suspicious of 'long' in general.)
> 
> I think that you could use BlockNumber + strtoul() without breaking Windows.

Fair enough.

>> There are a LOT of things that can go wrong when we go try to run
>> verify_heapam on a table. The table might have been dropped; in fact,
>> on a busy production system, such cases are likely to occur routinely
>> if DDL is common, which for many users it is. The system catalog
>> entries might be screwed up, so that the relation can't be opened.
>> There might be an unreadable page in the relation, either because the
>> OS reports an I/O error or something like that, or because checksum
>> verification fails. There are various other possibilities. We
>> shouldn't view such errors as low-level things that occur only in
>> fringe cases; this is a corruption-checking tool, and we should expect
>> that running it against messed-up databases will be common. We
>> shouldn't try to interpret the errors we get or make any big decisions
>> about them, but we should have a clear way of reporting them so that
>> the user can decide what to do.
> 
> I agree.
> 
> Your database is not supposed to be corrupt. Once your database has
> become corrupt, all bets are off -- something happened that was
> supposed to be impossible -- which seems like a good reason to be
> modest about what we think we know.
> 
> The user should always see the unvarnished truth. pg_amcheck should
> not presume to suppress errors from lower level code, except perhaps
> in well-scoped special cases.

I think Robert mistook why I was doing that.  I was thinking about a different
usage pattern.  If somebody thinks a subset of relations have been badly
corrupted, but doesn't know which relations those might be, they might try to
find them with pg_amcheck, wanting to just check the first few blocks per
relation in order to sample the relations.  So,

  pg_amcheck --startblock=0 --endblock=9 --no-dependent-indexes

or something like that.  I don't think it's very fun to have it error out for 
each relation that doesn't have at least ten blocks, nor is it fun to have 
those relations skipped by error'ing out before checking any blocks, as they 
might be the corrupt relations you are looking for.  But using --startblock and 
--endblock for this is not a natural fit, as evidenced by how I was trying to 
"fix things up" for the user, so I'll punt on this usage until some future 
version, when I might add a sampling option.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: PROXY protocol support

2021-03-04 Thread Magnus Hagander
On Thu, Mar 4, 2021 at 10:01 PM Jan Wieck  wrote:
>
> On 3/4/21 3:40 PM, Magnus Hagander wrote:
> > On Thu, Mar 4, 2021 at 9:29 PM Jan Wieck  wrote:
> >> This looks like it would only need a few extra protocol messages to be
> >> understood by the backend. It might be possible to implement that with
> >> the loadable wire protocol extensions proposed here:
> >>
> >> https://commitfest.postgresql.org/32/3018/
> >
> > Actually the whole point of it is that it *doesn't* need any new
> > protocol messages. And that it *wraps* whatever is there, definitely
> > doesn't replace it. It should equally be wrapping whatever an
> > extension uses.
> >
> > So while the base topic is not unrelated, I don't think there is any
> > overlap between these.
>
> I might be missing something here, but isn't sending some extra,
> informational *header*, which is understood by the backend, in essence a
> protocol extension?

Bad choice of words, I guess.

The points being, there is a single packet sent ahead of the normal
stream. There are no new messages in "the postgresql protocol" or "the
febe protocol" or whatever we call it. And it doesn't change the
properties of any part of that protocol. And, importantly for the
simplicity, there is no negotiation and there are no packets going the
other way.

But sure, you can call it a protocol extension if you want. And yes,
it could probably be built on top of part of the ideas in that other
patch, but most of it would be useless (the abstraction of the listen
functionality into listen_have_free_slot/listen_add_socket would be
the big thing that could be used)

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Fix DROP TABLESPACE on Windows with ProcSignalBarrier?

2021-03-04 Thread Thomas Munro
On Thu, Mar 4, 2021 at 11:54 AM Thomas Munro  wrote:
> > I've been wondering what obscure hazards these "tombstone" (for want
> > of a better word) files guard against, besides the one described in
> > the comments for mdunlink().  I've been  thinking about various
> > schemes that can be summarised as "put the tombstones somewhere else",
> > but first... this is probably a stupid question, but what would break
> > if we just ... turned all this stuff off when wal_level is high enough
> > (as it is by default)?

The "how-to-make-it-so-that-we-don't-need-a-checkpoint" subtopic is
hereby ejected from this thread, and moved over here:
https://commitfest.postgresql.org/33/3030/




Re: pg_amcheck contrib application

2021-03-04 Thread Peter Geoghegan
On Thu, Mar 4, 2021 at 7:29 AM Robert Haas  wrote:
> I think this whole approach is pretty suspect because the number of
> blocks in the relation can increase (by relation extension) or
> decrease (by VACUUM or TRUNCATE) between the time when we query for
> the list of target relations and the time we get around to executing
> any queries against them. I think it's OK to use the number of
> relation pages for progress reporting because progress reporting is
> only approximate anyway, but I wouldn't print them out in the progress
> messages, and I wouldn't try to fix up the startblock and endblock
> arguments on the basis of how long you think that relation is going to
> be.

I don't think that the struct AmcheckOptions block fields (e.g.,
startblock) should be of type 'long' -- that doesn't work well on
Windows, where 'long' is only 32-bit. To be fair we already do the
same thing elsewhere, but there is no reason to repeat those mistakes.
(I'm rather suspicious of 'long' in general.)

I think that you could use BlockNumber + strtoul() without breaking Windows.
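The type point can be sketched language-independently (illustrative Python; `parse_block_number` is a hypothetical helper, and 0xFFFFFFFF is PostgreSQL's InvalidBlockNumber sentinel): validate the argument against the full unsigned 32-bit BlockNumber range, rather than whatever a platform 'long' happens to hold.

```python
INVALID_BLOCK_NUMBER = 0xFFFFFFFF  # PostgreSQL's InvalidBlockNumber

def parse_block_number(arg: str) -> int:
    """Parse a --startblock/--endblock style argument into the
    BlockNumber (uint32) range, mimicking strtoul()-based validation."""
    try:
        value = int(arg, 10)
    except ValueError:
        raise ValueError(f"invalid block number: {arg!r}")
    # Reject negatives and the InvalidBlockNumber sentinel itself.
    if not 0 <= value < INVALID_BLOCK_NUMBER:
        raise ValueError(f"block number out of range: {arg}")
    return value
```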

> There are a LOT of things that can go wrong when we go try to run
> verify_heapam on a table. The table might have been dropped; in fact,
> on a busy production system, such cases are likely to occur routinely
> if DDL is common, which for many users it is. The system catalog
> entries might be screwed up, so that the relation can't be opened.
> There might be an unreadable page in the relation, either because the
> OS reports an I/O error or something like that, or because checksum
> verification fails. There are various other possibilities. We
> shouldn't view such errors as low-level things that occur only in
> fringe cases; this is a corruption-checking tool, and we should expect
> that running it against messed-up databases will be common. We
> shouldn't try to interpret the errors we get or make any big decisions
> about them, but we should have a clear way of reporting them so that
> the user can decide what to do.

I agree.

Your database is not supposed to be corrupt. Once your database has
become corrupt, all bets are off -- something happened that was
supposed to be impossible -- which seems like a good reason to be
modest about what we think we know.

The user should always see the unvarnished truth. pg_amcheck should
not presume to suppress errors from lower level code, except perhaps
in well-scoped special cases.

-- 
Peter Geoghegan




Make relfile tombstone files conditional on WAL level

2021-03-04 Thread Thomas Munro
Hi,

I'm starting a new thread for this patch that originated as a
side-discussion in [1], to give it its own CF entry in the next cycle.
This is a WIP with an open question to research: what could actually
break if we did this?

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKGLdemy2gBm80kz20GTe6hNVwoErE8KwcJk6-U56oStjtg%40mail.gmail.com
From 61a15ed286a1fd824b4e2b4b689cbe6688930e6e Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 2 Mar 2021 16:09:51 +1300
Subject: [PATCH] Make relfile tombstone files conditional on WAL level.

Traditionally we have left behind an empty file to be unlinked after the
next checkpoint, when dropping a relation.  That would prevent
GetNewRelFileNode() from recycling the relfile, avoiding a schedule
that could corrupt data in crash recovery, with wal_level=minimal.

Since the default wal_level changed to replica in release 10, and since
this mechanism introduces costs elsewhere, let's only do it if we have
to.  In particular, this avoids the need for DROP TABLESPACE to force
an expensive checkpoint just to clear out tombstone files.

XXX What would break if we did this?

Discussion: https://postgr.es/m/CA%2BhUKGLT3zibuLkn_j9xiPWn6hxH9Br-TsJoSaFgQOpxpEUnPQ%40mail.gmail.com
---
 src/backend/commands/tablespace.c | 14 +++---
 src/backend/storage/smgr/md.c | 10 ++
 src/include/access/xlog.h |  6 ++
 3 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 69ea155d50..c1d12f3d19 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -512,8 +512,10 @@ DropTableSpace(DropTableSpaceStmt *stmt)
 	 */
 	if (!destroy_tablespace_directories(tablespaceoid, false))
 	{
+		bool		try_again = false;
+
 		/*
-		 * Not all files deleted?  However, there can be lingering empty files
+		 * Not all files deleted?  However, there can be lingering tombstones
 		 * in the directories, left behind by for example DROP TABLE, that
 		 * have been scheduled for deletion at next checkpoint (see comments
 		 * in mdunlink() for details).  We could just delete them immediately,
@@ -528,8 +530,14 @@ DropTableSpace(DropTableSpaceStmt *stmt)
 		 * TABLESPACE should not give up on the tablespace becoming empty
 		 * until all relevant invalidation processing is complete.
 		 */
-		RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
-		if (!destroy_tablespace_directories(tablespaceoid, false))
+
+		if (XLogNeedRelFileTombstones())
+		{
+			RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
+			try_again = true;
+		}
+
+		if (!try_again || !destroy_tablespace_directories(tablespaceoid, false))
 		{
 			/* Still not empty, the files must be important then */
 			ereport(ERROR,
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 1e12cfad8e..447122519e 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -236,8 +236,9 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
  * forkNum can be a fork number to delete a specific fork, or InvalidForkNumber
  * to delete all forks.
  *
- * For regular relations, we don't unlink the first segment file of the rel,
- * but just truncate it to zero length, and record a request to unlink it after
+ * For regular relations, we don't always unlink the first segment file,
+ * depending on the WAL level.  If XLogNeedRelFileTombstones() is true, we
+ * just truncate it to zero length, and record a request to unlink it after
  * the next checkpoint.  Additional segments can be unlinked immediately,
  * however.  Leaving the empty file in place prevents that relfilenode
  * number from being reused.  The scenario this protects us from is:
@@ -321,7 +322,8 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 	/*
 	 * Delete or truncate the first segment.
 	 */
-	if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
+	if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode) ||
+		!XLogNeedRelFileTombstones())
 	{
 		if (!RelFileNodeBackendIsTemp(rnode))
 		{
@@ -349,7 +351,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 		/* Prevent other backends' fds from holding on to the disk space */
 		ret = do_truncate(path);
 
-		/* Register request to unlink first segment later */
+		/* Leave the file as a tombstone, to be unlinked at checkpoint time. */
 		register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
 	}
 
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 75ec1073bd..cca04a6aa8 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -207,6 +207,12 @@ extern PGDLLIMPORT int wal_level;
 /* Do we need to WAL-log information required only for logical replication? */
 #define XLogLogicalInfoActive() (wal_level >= WAL_LEVEL_LOGICAL)
 
+/*
+ * Is the WAL-level so low that it 

Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2021-03-04 Thread Alvaro Herrera
On 2021-Mar-04, Alvaro Herrera wrote:

> I don't know where do __WSAFDIsSet and __imp_select come from or what to
> do about them.  Let's see if adding pgport and pgcommon fixes things.

Indeed all those other problems were fixed and these remain.  New
failure is:

"C:\projects\postgresql\pgsql.sln" (default target) (1) ->
6007"C:\projects\postgresql\libpq_pipeline.vcxproj" (default target) (55) ->
6008(Link target) -> 
6009  libpq_pipeline.obj : error LNK2019: unresolved external symbol 
__WSAFDIsSet referenced in function test_pipelined_insert 
[C:\projects\postgresql\libpq_pipeline.vcxproj]
6010  libpq_pipeline.obj : error LNK2019: unresolved external symbol 
__imp_select referenced in function test_pipelined_insert 
[C:\projects\postgresql\libpq_pipeline.vcxproj]
6011  .\Release\libpq_pipeline\libpq_pipeline.exe : fatal error LNK1120: 2 
unresolved externals [C:\projects\postgresql\libpq_pipeline.vcxproj]

I did notice that isolationtester.c is using select(), and one
difference is that it includes  which libpq_pipeline.c
does not -- and it also pulls in ws2_32.lib.  Let's see if those two
changes fix things.

-- 
Álvaro Herrera   Valdivia, Chile
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 0553279314..c87b0ce911 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -3173,6 +3173,33 @@ ExecStatusType PQresultStatus(const PGresult *res);

   
  
+
+ 
+  PGRES_PIPELINE_SYNC
+  
+   
+The PGresult represents a
+synchronization point in pipeline mode, requested by 
+.
+This status occurs only when pipeline mode has been selected.
+   
+  
+ 
+
+ 
+  PGRES_PIPELINE_ABORTED
+  
+   
+The PGresult represents a pipeline that has
+received an error from the server.  PQgetResult
+must be called repeatedly, and each time it will return this status code
+until the end of the current pipeline, at which point it will return
+PGRES_PIPELINE_SYNC and normal processing can
+resume.
+   
+  
+ 
+
 
 
 If the result status is PGRES_TUPLES_OK or
@@ -4919,6 +4946,498 @@ int PQflush(PGconn *conn);
 
  
 
+ 
+  Pipeline Mode
+
+  
+   libpq
+   pipeline mode
+  
+
+  
+   pipelining
+   in libpq
+  
+
+  
+   batch mode
+   in libpq
+  
+
+  
+   libpq pipeline mode allows applications to
+   send a query without having to read the result of the previously
+   sent query.  Taking advantage of the pipeline mode, a client will wait
+   less for the server, since multiple queries/results can be sent/
+   received in a single network transaction.
+  
+
+  
+   While pipeline mode provides a significant performance boost, writing
+   clients using the pipeline mode is more complex because it involves
+   managing a queue of pending queries and finding which result
+   corresponds to which query in the queue.
+  
+
+  
+   Pipeline mode also generally consumes more memory on both the client and server,
+   though careful and aggressive management of the send/receive queue can mitigate
+   this.  This applies whether or not the connection is in blocking or non-blocking
+   mode.
+  
+
+  
+   Using Pipeline Mode
+
+   
+To issue pipelines, the application must switch a connection into pipeline mode.
+Enter pipeline mode with 
+or test whether pipeline mode is active with
+.
+In pipeline mode, only asynchronous operations
+are permitted, and COPY is disallowed.
+Using any synchronous command execution functions,
+such as PQfn, or PQexec
+and its sibling functions, is an error condition.
+   
+
+   
+
+ It is best to use pipeline mode with libpq in
+ non-blocking mode. If used
+ in blocking mode it is possible for a client/server deadlock to occur.
+  
+   
+The client will block trying to send queries to the server, but the
+server will block trying to send results to the client from queries
+it has already processed. This only occurs when the client sends
+enough queries to fill both its output buffer and the server's receive
+buffer before it switches to processing input from the server,
+but it's hard to predict exactly when that will happen.
+   
+  
+
+   
+
+   
+Issuing Queries
+
+
+ After entering pipeline mode, the application dispatches requests using
+ , 
+ or its prepared-query sibling
+ .
+ These requests are queued on the client-side until flushed to the server;
+ this occurs when  is used to
+ establish a synchronization point in the pipeline,
+ or when  is called.
+ The functions ,
+ , and
+  also work in pipeline mode.
+ Result processing is described below.
+
+
+
+ The server executes statements, and returns 

Re: Removing support for COPY FROM STDIN in protocol version 2

2021-03-04 Thread Tom Lane
Heikki Linnakangas  writes:
> Joseph, any chance we could see a backtrace or some other details from 
> those crashes?

+1

> 'drongo' just reported linker errors:
> postgres.def : error LNK2001: unresolved external symbol 
> GetOldFunctionMessage 
> [c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
> postgres.def : error LNK2001: unresolved external symbol errfunction 
> [c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
> postgres.def : error LNK2001: unresolved external symbol pq_getstring 
> [c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
> postgres.def : error LNK2001: unresolved external symbol pq_putbytes 
> [c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
> Release/postgres/postgres.lib : fatal error LNK1120: 4 unresolved 
> externals [c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
> Done Building Project 
> "c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj" (default 
> targets) -- FAILED.

As far as that goes, I think suspicion has to fall on this:

  Not re-generating POSTGRES.DEF, file already exists.

which gendef.pl prints if it thinks the def file is newer than
all the inputs.  So either drongo had some kind of clock skew
issue, or that bit of logic in gendef.pl has some unobvious bug.

(I say "had" because I see the next run went fine.)
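The newer-than check in question is the classic make-style skip rule; a sketch (illustrative Python, names hypothetical) shows the clock-skew failure mode: a stale output can look fresh if the clock that stamped it ran ahead of the one that stamped the inputs.

```python
import os

def needs_regeneration(output: str, inputs: list) -> bool:
    """Rebuild if the output is missing or any input is at least as new.
    A forward-skewed output timestamp (or backward-skewed input one) can
    make a stale output look fresh -- the failure mode suspected for
    gendef.pl's "file already exists" skip."""
    if not os.path.exists(output):
        return True
    out_mtime = os.path.getmtime(output)
    return any(os.path.getmtime(f) >= out_mtime for f in inputs)
```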

regards, tom lane




Re: Add .log file ending to tap test log files

2021-03-04 Thread Alvaro Herrera
On 2021-Mar-04, Andres Freund wrote:

> Right now it's harder than necessary to capture the log output from tap
> tests because the the regression tests files don't end with a common
> file ending with other types of logs.  They're
>   # Open the test log file, whose name depends on the test name.
>   $test_logfile = basename($0);
>   $test_logfile =~ s/\.[^.]+$//;
>   $test_logfile = "$log_path/regress_log_$test_logfile";
>
> This was essentially introduced in 1ea06203b82: "Improve logging of TAP 
> tests."

You're misreading this code (I did too): there's no "_logfile" suffix --
$test_logfile is the name of a single variable, it's not $test followed
by _logfile.  So the name is "regress_log_001_FOOBAR" with the basename
at the end.  But I agree:

> Would anybody object to replacing _logfile with .log? I realize that'd
> potentially would cause some short-term pain on the buildfarm, but I
> think it'd improve things longer term.

Let's add a .log suffix.  And also, I would propose a more extensive
renaming, if we're going to do it -- I dislike that the server log files
start with "00x" and the regress ones have the 00x bit in the middle of
the name.  So how about we make this
  $log_path/$test_logfile.regress.log.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"Los dioses no protegen a los insensatos.  Éstos reciben protección de
otros insensatos mejor dotados" (Luis Wu, Mundo Anillo)




Re: [PATCH] postgres-fdw: column option to override foreign types

2021-03-04 Thread Tom Lane
"Dian M Fay"  writes:
> On Thu Mar 4, 2021 at 9:28 AM EST, Georgios Kokolatos wrote:
>> I am afraid I will have to :-1: this patch.

> I see room for interpretation in the design here, although I have
> admittedly not been looking at it for very long. `CREATE/ALTER FOREIGN
> TABLE` take the user at their word about types, which only map 1:1 for a
> foreign Postgres server anyway.

Right.

> In make_tuple_from_result_row, incoming
> values start as strings until they're converted to their target types --
> again, with no guarantee that those types match those on the remote
> server.

The data conversion itself provides a little bit of security --- for
instance, converting 'foobar' to int or timestamp will fail.  It's
not bulletproof, but on the other hand there are indeed situations
where you don't want to declare the column locally with exactly the
type the remote server is using, so trying to be bulletproof would
be counterproductive.
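A rough analog of that conversion-time check (illustrative Python; the real mechanism is the local type's input function applied in make_tuple_from_result_row):

```python
def coerce_remote_value(text_value: str, target_type: str):
    """Convert a text value received from the remote server to the locally
    declared column type, failing loudly on a gross mismatch -- a rough
    analog of applying the local type's input function."""
    converters = {
        "int4": int,
        "float8": float,
    }
    try:
        return converters[target_type](text_value)
    except (KeyError, ValueError) as e:
        raise ValueError(f"cannot convert {text_value!r} to {target_type}") from e
```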

I am not, however, any more impressed than the other respondents with
the solution you've proposed.  For one thing, this can only help if
the local type is known to the remote server, which seems to eliminate
fifty per cent of the use-cases for intentional differences in type.
(That is, isn't it equally as plausible that the local type is an
enum you didn't bother making on the remote side?)  But a bigger issue
is that shipping
WHERE foreigncol::text = 'one'::text
to the remote server is not a nice solution even if it works.  It will,
for example, defeat use of a normal index on foreigncol.  It'd likely
be just as inefficient for remote joins.

What'd be better, if we could do it, is to ship the clause in
the form
WHERE foreigncol = 'one'
that is, instead of plastering a cast on the Var, try to not put
any explicit cast on the constant.  That fixes your original use
case better than what you've proposed, and I think it might be
possible to do it unconditionally instead of needing a hacky
column property to enable it.  The reason this could be okay
is that it seems reasonable for postgres_fdw to rely on the
core parser's heuristic that an unknown-type literal is the
same type as what it's being compared to.  So, if we are trying
to deparse something of the form "foreigncol operator constant",
and the foreigncol and constant are of the same type locally,
we could leave off the cast on the constant.  (There might need
to be some restrictions about the operator taking those types
natively with no cast, not sure; also this doesn't apply to
constants that are going to be printed as non-string literals.)

Slipping this heuristic into the code structure of deparse.c
might be rather messy, though.  I've not looked at just how
to implement it.
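The proposed heuristic, reduced to a sketch (illustrative Python; real deparse.c logic would need the additional operator and literal-form checks mentioned above):

```python
def deparse_comparison(col: str, col_type: str, const: str, const_type: str) -> str:
    """When the column and constant share the same local type, ship an
    uncast string literal and rely on the remote parser's rule that an
    unknown-type literal takes the type it is compared against; otherwise
    keep the explicit cast."""
    literal = "'" + const.replace("'", "''") + "'"
    if const_type == col_type:
        return f"{col} = {literal}"
    return f"{col} = {literal}::{const_type}"
```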

regards, tom lane




Re: pg_amcheck contrib application

2021-03-04 Thread Robert Haas
On Thu, Mar 4, 2021 at 12:27 PM Robert Haas  wrote:
> More in a bit, need to grab some lunch.

Moving on to the tests, in 003_check.pl, I think it would be slightly
better if relation_toast were to select ct.oid::regclass and then just
have the caller use that value directly. We'd certainly want to do
that if the name could contain any characters that might require
quoting. Here that's not possible, but I think we might as well use
the same technique anyway.

I'm not sure how far to go with it, but I think that you might want to
try to enhance the logging in some of the cases where the TAP tests
might fail. In particular, if either of these trip in the buildfarm,
it doesn't seem like it will be too easy to figure out why they
failed:

+fail('Xid thresholds not as expected');
+fail('Page layout differs from our expectations');

You might want to rephrase the message to incorporate the values that
triggered the failure, e.g. "datfrozenxid $datfrozenxid is not between
3 and $relfrozenxid", "expected (a,b) = (12345678,abcdefg) but got
($x,$y)", so that if the buildfarm happens to fail there's a shred of
hope that we might be able to guess the reason from the message. You
could also give some thought to whether there are any tests that can
be improved in similar ways. Test::More is nice in that when you run a
test with eq() or like() and it fails it will tell you about the input
values in the diagnostic, but if you do something like is($x < 4, ...)
instead of cmp_ok($x, '<', 4, ...) then you lose that. I'm not saying
you're doing that exact thing, just saying that looking through the
test code with an eye to finding things where you could output a
little more info about a potential failure might be a worthwhile
activity.
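The difference in diagnosability fits in a two-line sketch (illustrative Python; `check_threshold` is hypothetical): the failure message carries the observed values, as cmp_ok() would, so a buildfarm log is readable after the fact.

```python
def check_threshold(datfrozenxid: int, relfrozenxid: int) -> None:
    """Like cmp_ok() vs a bare is(): embed the observed values in the
    failure message instead of reporting only pass/fail."""
    assert 3 <= datfrozenxid <= relfrozenxid, (
        f"datfrozenxid {datfrozenxid} is not between 3 and {relfrozenxid}")
```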

If it were me, I would get rid of ROWCOUNT and have a list of
closures, and then loop over the list and call each one e.g. my
@corruption = ( sub { ... }, sub { ... }, sub { ... }) or maybe
something like what I did with @scenario in
src/bin/pg_verifybackup/t/003_corruption.pl, but this is ultimately a
style preference and I think the way you actually did it is also
reasonable, and some people might find it more readable than the other
way.

The name int4_fickle_ops is positively delightful and I love having a
test case like this.

On the whole, I think these tests look quite solid. I am a little
concerned, as you may gather from the comment above, that they will
not survive contact with the buildfarm, because they will turn out to
be platform or OS-dependent in some way. However, I can see that
you've taken steps to avoid such dependencies, and maybe we'll be
lucky and those will work. Also, while I am suspicious something's
going to break, I don't know what it's going to be, so I can't suggest
any method to avoid it. I think we'll just have to keep an eye on the
buildfarm post-commit and see what crops up.

Turning to the documentation, I see that it is documented that a bare
command-line argument can be a connection string rather than a
database name. That sounds like a good plan, but when I try
'pg_amcheck sslmode=require' it does not work: FATAL:  database
"sslmode=require" does not exist. The argument to -e is also
documented to be a connection string, but that also seems not to work.
Some thought might need to be given to what exactly these connection
options are supposed to mean. Like, do the connection options I set via
-e apply to all the connections I make, or just the one to the
maintenance database? How do I set connection options for connections
to databases whose names aren't specified explicitly but are
discovered by querying pg_database? Maybe instead of allowing these to
be a connection string, we should have a separate option that can be
used just for the purpose of setting connection options that then
apply to all connections. That seems a little bit oddly unlike other
tools, but if I want sslmode=verify-ca or something on all my
connections, there should be an easy way to get it.
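One way such a separate option could compose (illustrative Python; the helper and the notion of a single options flag are hypothetical): merge the user-supplied options into every per-database conninfo, overriding only dbname per target.

```python
def conninfo_for_db(base_options: dict, dbname: str) -> str:
    """Apply one set of user-supplied connection options (e.g. sslmode)
    to every database pg_amcheck visits, whether named explicitly or
    discovered from pg_database; only dbname varies per connection."""
    options = dict(base_options, dbname=dbname)
    return " ".join(f"{k}={v}" for k, v in sorted(options.items()))
```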

The documentation makes many references to patterns, but does not
explain what a pattern is. I see that psql's documentation contains an
explanation, and pg_dump's documentation links to psql's
documentation. pg_amcheck should probably link to psql's
documentation, too.

In the documentation for -d, you say that "If -a --all is also
specified, -d --database does not additionally affect which databases
are checked." I suggest replacing "does not additionally affect which
databases are checked" with "has no effect."

In two places you say "without regard for" but I think it should be
"without regard to".

In the documentation for --no-strict-names you use "nor" where I think
it should say "or".

I kind of wonder whether we need --quiet. It seems like right now it
only does two things. One is to control complaints about ignoring the
startblock and endblock options, but I don't agree with that behavior
anyway. The other is control whether we complain about unmatched
patterns, but I think 

Re: Removing support for COPY FROM STDIN in protocol version 2

2021-03-04 Thread Andrew Dunstan


On 3/4/21 3:55 PM, Heikki Linnakangas wrote:
>
>
>
>
> 'drongo' just reported linker errors:
>
> postgres.def : error LNK2001: unresolved external symbol
> GetOldFunctionMessage
> [c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
> postgres.def : error LNK2001: unresolved external symbol errfunction
> [c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
> postgres.def : error LNK2001: unresolved external symbol pq_getstring
> [c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
> postgres.def : error LNK2001: unresolved external symbol pq_putbytes
> [c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
> Release/postgres/postgres.lib : fatal error LNK1120: 4 unresolved
> externals [c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
> Done Building Project
> "c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj" (default
> targets) -- FAILED.
>
> Looks like it wasn't a clean build, those functions and all the
> callers were removed by the patch. That's a separate issue than on
> 'walleye' - unless that was also not a completely clean build?
>
>


Yes, pilot error :-)


It's rerunning and should report clean shortly


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: CI/windows docker vs "am a service" autodetection on windows

2021-03-04 Thread Alvaro Herrera
On 2021-Mar-04, Andres Freund wrote:

> > > There does seem to be isatty(), so we could improve the case of
> > > pg_ctl/postgres run interactively without breaking a sweat. And there is
> > > fstat() too, so if stderr in a service is something distinguishable...
> > 
> > We seem to have used that at some point, but commit
> > a967613911f7ef7b6387b9e8718f0ab8f0c4d9c8 got rid of it...
> 
> Hm. The bug #13592 referenced in that commit appears to be about
> something else.  Looks to be #13594
> https://postgr.es/m/20150828104658.2089.83265%40wrigleys.postgresql.org

Yeah, that's a typo in the commit message.

> > But maybe apply it in a combination.
> 
> Yea, that's what I was thinking.

That makes sense.  At the time we were not thinking (*I* was not
thinking, for sure) that you could have a not-a-service process that
runs inside a service.  The fixed bug was in the same direction that you
want to fix now, just differently: the bare "isatty" test was
considering too many things as under a service, and replaced it with the
pgwin32_is_service which considers a different set of too many things as
under a service.  I agree with the idea that *both* tests have to pass
in order to consider it as under a service.
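The agreed-on combination reduces to a one-line predicate (illustrative Python; in the server this would combine an isatty() check on stderr with pgwin32_is_service()):

```python
def running_as_service(stderr_is_tty: bool, win32_service_flag: bool) -> bool:
    """Treat the process as a Windows service only when *both* heuristics
    agree: stderr is not a terminal AND the service-manager-based check
    says we are running under a service."""
    return (not stderr_is_tty) and win32_service_flag
```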

-- 
Álvaro Herrera39°49'30"S 73°17'W
"No hay hombre que no aspire a la plenitud, es decir,
la suma de experiencias de que un hombre es capaz"




Re: PROXY protocol support

2021-03-04 Thread Jan Wieck

On 3/4/21 3:40 PM, Magnus Hagander wrote:

On Thu, Mar 4, 2021 at 9:29 PM Jan Wieck  wrote:

This looks like it would only need a few extra protocol messages to be
understood by the backend. It might be possible to implement that with
the loadable wire protocol extensions proposed here:

https://commitfest.postgresql.org/32/3018/


Actually the whole point of it is that it *doesn't* need any new
protocol messages. And that it *wraps* whatever is there, definitely
doesn't replace it. It should equally be wrapping whatever an
extension uses.

So while the base topic is not unrelated, I don't think there is any
overlap between these.


I might be missing something here, but isn't sending some extra, 
informational *header*, which is understood by the backend, in essence a 
protocol extension?



Regards, Jan

--
Jan Wieck
Principal Database Engineer
Amazon Web Services




Re: Huge memory consumption on partitioned table with FKs

2021-03-04 Thread Tom Lane
Amit Langote  writes:
> Updated patch attached.

This claim seems false on its face:

> All child constraints of a given foreign key constraint can use the
> same RI query and the resulting plan, that is, no need to create as
> many copies of the query and the plan as there are partitions, as
> happens now due to the child constraint OID being used in the key
> for ri_query_cache.

What if the child tables don't have the same physical column numbers
as the parent?  The comment claiming that it's okay if riinfo->fk_attnums
doesn't match seems quite off point, because the query plan is still
going to need to use the correct column numbers.  Even if column numbers
are the same, the plan would also contain table and index OIDs that could
only be right for one partition.

I could imagine translating a parent plan to apply to a child instead of
building it from scratch, but that would take a lot of code we don't have
(there's no rewriteHandler infrastructure for plan nodes).
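For illustration, translating a parent constraint's fk_attnums to a child partition with different physical column numbers would amount to a name-based remapping (illustrative Python; the structures are hypothetical) -- and that is only the attnum part of the plan-translation machinery that, as noted, doesn't exist for plan nodes:

```python
def translate_fk_attnums(parent_attnums, parent_cols, child_cols):
    """Translate fk_attnums defined against the parent to a child whose
    physical column order differs (e.g. after dropped columns), matching
    by column name.  parent_cols/child_cols map attnum -> column name."""
    name_to_child_attnum = {name: num for num, name in child_cols.items()}
    return [name_to_child_attnum[parent_cols[a]] for a in parent_attnums]
```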

Maybe I'm missing something, because I see that the cfbot claims
this is passing, and I'd sure like to think that our test coverage
is not so thin that it'd fail to detect probing the wrong partition
for foreign key matches.  But that's what it looks like this patch
will do.

regards, tom lane




Re: Removing support for COPY FROM STDIN in protocol version 2

2021-03-04 Thread Heikki Linnakangas

On 04/03/2021 22:04, Tom Lane wrote:

Heikki Linnakangas  writes:

I concur that 0001 attached is committable.  I have not looked at
your 0002, though.



Removed the entry from nls.mk, and pushed 0001. Thanks!


It seems that buildfarm member walleye doesn't like this.
Since nothing else is complaining, I confess bafflement
as to why.  walleye seems to be our only active mingw animal,
so maybe there's a platform dependency somewhere ... but
how would (mostly) removal of code expose that?


Strange indeed. The commands that are crashing seem far detached from 
any FE/BE protocol handling, and I don't see any other pattern either:


2021-03-04 05:08:45.953 EST [4080:94] DETAIL:  Failed process was 
running: copy (insert into copydml_test default values) to stdout;


2021-03-04 05:09:22.690 EST [4080:100] DETAIL:  Failed process was 
running: CREATE INDEX CONCURRENTLY concur_index7 ON concur_heap(f1);


2021-03-04 05:09:33.546 EST [4080:106] DETAIL:  Failed process was 
running: ANALYZE vaccluster;


2021-03-04 05:09:42.452 EST [4080:112] DETAIL:  Failed process was 
running: FETCH BACKWARD 1 FROM foo24;


2021-03-04 05:10:06.874 EST [4080:118] DETAIL:  Failed process was 
running: REFRESH MATERIALIZED VIEW CONCURRENTLY mvtest_tvmm;


2021-03-04 05:12:23.890 EST [4080:125] DETAIL:  Failed process was 
running: CREATE SUBSCRIPTION regress_testsub CONNECTION 'testconn' 
PUBLICATION testpub;


2021-03-04 05:15:46.421 EST [4080:297] DETAIL:  Failed process was 
running: INSERT INTO xmltest VALUES (3, '

Dare I suggest a compiler bug? gcc 8.1 isn't the fully up-to-date, 
although I don't know if there's a newer gcc available on this platform. 
Joseph, any chance we could see a backtrace or some other details from 
those crashes?




'drongo' just reported linker errors:

postgres.def : error LNK2001: unresolved external symbol 
GetOldFunctionMessage 
[c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
postgres.def : error LNK2001: unresolved external symbol errfunction 
[c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
postgres.def : error LNK2001: unresolved external symbol pq_getstring 
[c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
postgres.def : error LNK2001: unresolved external symbol pq_putbytes 
[c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
Release/postgres/postgres.lib : fatal error LNK1120: 4 unresolved 
externals [c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj]
Done Building Project 
"c:\\prog\\bf\\root\\HEAD\\pgsql.build\\postgres.vcxproj" (default 
targets) -- FAILED.


Looks like it wasn't a clean build, those functions and all the callers 
were removed by the patch. That's a separate issue than on 'walleye' - 
unless that was also not a completely clean build?


- Heikki




Re: Extensibility of the PostgreSQL wire protocol

2021-03-04 Thread Jan Wieck

On 3/3/21 2:43 PM, Peter Eisentraut wrote:


I think, the way the abstractions are chosen in this patch, it is still
very much tied to how the libpq protocol works.  For example, there is a
cancel key and a ready-for-query message.  Some of the details of the
simple and the extended query are exposed.  So you could create a
protocol that has a different way of encoding the payloads, as your
telnet example does, but I don't believe that you could implement a
competitor's protocol through this.  Unless you have that done and want
to show it?



Correct, a lot of what this patch does is to allow a developer of such 
protocol extension to just "extend" what the server side libpq does, by 
building a wrapper around the function they are interested in. That 
doesn't change the protocol, but rather allows additional functionality 
like the telemetry data gathering, Fabrizio was talking about.


The telnet_srv tutorial extension (which needs more documentation) is an 
example of how far one can go by replacing those functions, in that it 
actually implements a very different wire protocol. This one still fits 
into the regular libpq message flow.


Another possibility, and this is what is being used by the AWS team 
implementing the TDS protocol for Babelfish, is to completely replace 
the entire TCOP mainloop function PostgresMain(). That is of course a 
rather drastic move and requires a lot more coding on the extension 
side, but the whole thing was developed that way from the beginning and 
it is working. I don't have a definitive date when that code will be 
presented. Kuntal or Prateek may be able to fill in more details.



Regards, Jan

--
Jan Wieck
Principal Database Engineer
Amazon Web Services




Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2021-03-04 Thread Alvaro Herrera
On 2021-Mar-04, Alvaro Herrera wrote:

> v31.

Got this:

libpq_pipeline.obj : error LNK2019: unresolved external symbol __WSAFDIsSet 
referenced in function test_pipelined_insert 
[C:\projects\postgresql\libpq_pipeline.vcxproj]
libpq_pipeline.obj : error LNK2019: unresolved external symbol __imp_select 
referenced in function test_pipelined_insert 
[C:\projects\postgresql\libpq_pipeline.vcxproj]
libpq_pipeline.obj : error LNK2019: unresolved external symbol pg_snprintf 
referenced in function test_pipelined_insert 
[C:\projects\postgresql\libpq_pipeline.vcxproj]
libpq_pipeline.obj : error LNK2019: unresolved external symbol pg_vfprintf 
referenced in function pg_fatal_impl 
[C:\projects\postgresql\libpq_pipeline.vcxproj]
libpq_pipeline.obj : error LNK2019: unresolved external symbol pg_fprintf 
referenced in function pg_fatal_impl 
[C:\projects\postgresql\libpq_pipeline.vcxproj]
libpq_pipeline.obj : error LNK2019: unresolved external symbol pg_printf 
referenced in function pg_fatal_impl 
[C:\projects\postgresql\libpq_pipeline.vcxproj]
libpq_pipeline.obj : error LNK2019: unresolved external symbol pg_strerror 
referenced in function test_pipelined_insert 
[C:\projects\postgresql\libpq_pipeline.vcxproj]
libpq_pipeline.obj : error LNK2019: unresolved external symbol pg_strdup 
referenced in function main [C:\projects\postgresql\libpq_pipeline.vcxproj]
libpq_pipeline.obj : error LNK2019: unresolved external symbol pfree 
referenced in function test_singlerowmode 
[C:\projects\postgresql\libpq_pipeline.vcxproj]
libpq_pipeline.obj : error LNK2019: unresolved external symbol psprintf 
referenced in function test_singlerowmode 
[C:\projects\postgresql\libpq_pipeline.vcxproj]

pg_snprintf, pg_vfprintf, pg_fprintf, pg_printf, pg_strerror are in pgport.
pg_strdup and pfree, psprintf are in pgcommon.

I don't know where do __WSAFDIsSet and __imp_select come from or what to
do about them.  Let's see if adding pgport and pgcommon fixes things.

-- 
Álvaro Herrera39°49'30"S 73°17'W
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 0553279314..c87b0ce911 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -3173,6 +3173,33 @@ ExecStatusType PQresultStatus(const PGresult *res);

   
  
+
+ 
+  PGRES_PIPELINE_SYNC
+  
+   
+The PGresult represents a
+synchronization point in pipeline mode, requested by 
+.
+This status occurs only when pipeline mode has been selected.
+   
+  
+ 
+
+ 
+  PGRES_PIPELINE_ABORTED
+  
+   
+The PGresult represents a pipeline that has
+received an error from the server.  PQgetResult
+must be called repeatedly, and each time it will return this status code
+until the end of the current pipeline, at which point it will return
+PGRES_PIPELINE_SYNC and normal processing can
+resume.
+   
+  
+ 
+
 
 
 If the result status is PGRES_TUPLES_OK or
@@ -4919,6 +4946,498 @@ int PQflush(PGconn *conn);
 
  
 
+ 
+  Pipeline Mode
+
+  
+   libpq
+   pipeline mode
+  
+
+  
+   pipelining
+   in libpq
+  
+
+  
+   batch mode
+   in libpq
+  
+
+  
+   libpq pipeline mode allows applications to
+   send a query without having to read the result of the previously
+   sent query.  Taking advantage of the pipeline mode, a client will wait
+   less for the server, since multiple queries/results can be sent/
+   received in a single network transaction.
+  
+
+  
+   While pipeline mode provides a significant performance boost, writing
+   clients using the pipeline mode is more complex because it involves
+   managing a queue of pending queries and finding which result
+   corresponds to which query in the queue.
+  
+
+  
+   Pipeline mode also generally consumes more memory on both the client and server,
+   though careful and aggressive management of the send/receive queue can mitigate
+   this.  This applies whether or not the connection is in blocking or non-blocking
+   mode.
+  
+
+  
+   Using Pipeline Mode
+
+   
+To issue pipelines, the application must switch a connection into pipeline mode.
+Enter pipeline mode with 
+or test whether pipeline mode is active with
+.
+In pipeline mode, only asynchronous operations
+are permitted, and COPY is disallowed.
+Using any synchronous command execution functions,
+such as PQfn, or PQexec
+and its sibling functions, is an error condition.
+   
+
+   
+
+ It is best to use pipeline mode with libpq in
+ non-blocking mode. If used
+ in blocking mode it is possible for a client/server deadlock to occur.
+  
+   
+The client will block trying to send queries to the server, but the
+server will block trying to send results 

Re: CI/windows docker vs "am a service" autodetection on windows

2021-03-04 Thread Andres Freund
Hi,

On 2021-03-04 21:36:23 +0100, Magnus Hagander wrote:
> > I think that's a good answer for pg_ctl - not so sure about postgres
> > itself, at least once it's up and running. I don't know what lead to all
> > of this autodetection stuff, but is there the possibility of blocking on
> > whatever stderr is set too as a service?
> >
> > Perhaps we could make the service detection more reliable by checking
> > whether stderr is actually something useful?
> 
> So IIRC, and mind that this is like 15 years ago, there is something
> that looks like stderr, but the contents are thrown away. It probably
> exists specifically so that programs won't crash when run as a
> service...

Yea, that'd make sense.

I wish we had tests for the service stuff, but that's from long before
there were tap tests...


> > There does seem to be isatty(), so we could improve the case of
> > pg_ctl/postgres run interactively without breaking a sweat. And there is
> > fstat() too, so if stderr in a service is something distinguishable...
> 
> We seem to have used that at some point, but commit
> a967613911f7ef7b6387b9e8718f0ab8f0c4d9c8 got rid of it...

Hm. The bug #13592 referenced in that commit appears to be about
something else.  Looks to be #13594
https://postgr.es/m/20150828104658.2089.83265%40wrigleys.postgresql.org


> But maybe apply it in a combination.

Yea, that's what I was thinking.


Gah, I don't really want to know anything about windows, I just want to
hack on aio with proper working CI.

Greetings,

Andres Freund




Re: PROXY protocol support

2021-03-04 Thread Magnus Hagander
On Thu, Mar 4, 2021 at 8:45 PM Jacob Champion  wrote:
>
> On Thu, 2021-03-04 at 10:42 +0900, Tatsuo Ishii wrote:
> > Is there any formal specification for the "a protocol common and very
> > light weight in proxies"?
>
> See
>
> https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt

Yeah, it's currently in one of the comments, but should probably be
added to the docs side as well.

And yes tests :) Probably not a regression test, but some level of tap
testing should definitely be added. We'll just have to find a way to
do that without making haproxy a dependency to run the tests :)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: PROXY protocol support

2021-03-04 Thread Magnus Hagander
On Thu, Mar 4, 2021 at 9:07 PM Jacob Champion  wrote:
>
> On Wed, 2021-03-03 at 10:39 +0100, Magnus Hagander wrote:
> > On Wed, Mar 3, 2021 at 10:00 AM Magnus Hagander  wrote:
> > > Another option would of course be to listen on a separate port for it,
> > > which seems to be the "haproxy way". That would be slightly more code
> > > (we'd still want to keep the code for validating the list of trusted
> > > proxies I'd say), but maybe worth doing?
> >
> > In order to figure that out, I hacked up a poc on that. Once again
> > without updates to the docs, but shows approximately how much code
> > complexity it adds (not much).
>
> From a configuration perspective, I like that the separate-port
> approach can shift the burden of verifying trust to an external
> firewall, and that it seems to match the behavior of other major server
> software. But I don't have any insight into the relative security of
> the two options in practice; hopefully someone else can chime in.

Yeah I think that and the argument that the spec explicitly says it
should be on it's own port is the advantage. The disadvantage is,
well, more ports and more configuration. But it does definitely make a
more clean separation of concerns.


> >memset((char *) &hints, 0, sizeof(hints));
> >hints.ai_flags = AI_NUMERICHOST;
> >hints.ai_family = AF_UNSPEC;
> >
> >ret = pg_getaddrinfo_all(tok, NULL, &hints, &gai_result);
>
> Idle thought I had while setting up a local test rig: Are there any
> compelling cases for allowing PROXY packets to arrive over Unix
> sockets? (By which I mean, the proxy is running on the same machine as
> Postgres, and connects to it using the .s.PGSQL socket file instead of
> TCP.) Are there cases where you want some other software to interact
> with the TCP stack instead of Postgres, but it'd still be nice to have
> the original connection information available?

I'm uncertain what that usecase would be for something like haproxy,
tbh. It can't do connection pooling, so adding it on the same machine
as postgres itself wouldn't really add anything, I think?

I did think about the other end, if you had a proxy on a different
machine accepting unix connections and passing them on over
PROXY-over-tcp. But I doubt it's useful to know it was unix in that
case  (since it still couldn't do peer or such for the auth) --
instead, that seems like an argument where it'd be better to proxy
without using PROXY and just letting the IP address be.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: PROXY protocol support

2021-03-04 Thread Magnus Hagander
On Thu, Mar 4, 2021 at 9:29 PM Jan Wieck  wrote:
>
> On 3/4/21 2:45 PM, Jacob Champion wrote:
> > On Thu, 2021-03-04 at 10:42 +0900, Tatsuo Ishii wrote:
> >> Is there any formal specification for the "a protocol common and very
> >> light weight in proxies"?
> >
> > See
> >
> >  https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
> >
> > which is maintained by HAProxy Technologies.
> >
> > --Jacob
> >
>
> This looks like it would only need a few extra protocol messages to be
> understood by the backend. It might be possible to implement that with
> the loadable wire protocol extensions proposed here:
>
> https://commitfest.postgresql.org/32/3018/

Actually the whole point of it is that it *doesn't* need any new
protocol messages. And that it *wraps* whatever is there, definitely
doesn't replace it. It should equally be wrapping whatever an
extension uses.

So while the base topic is not unrelated, I don't think there is any
overlap between these.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: [POC] Fast COPY FROM command for the table with foreign partitions

2021-03-04 Thread Justin Pryzby
I think this change to the regression tests is suspicious:

-CONTEXT:  remote SQL command: INSERT INTO public.loc2(f1, f2) VALUES ($1, $2)
-COPY rem2, line 1: "-1 xyzzy"
+CONTEXT:  COPY loc2, line 1: "-1   xyzzy"
+remote SQL command: COPY public.loc2(f1, f2) FROM STDIN 
+COPY rem2, line 2

I think it shouldn't say "COPY rem2, line 2" but rather a remote version of the
same:
|COPY loc2, line 1: "-1   xyzzy"

I have rebased this on my side over yesterday's libpq changes - I'll send it if
you want, but it's probably just as easy if you do it.

-- 
Justin




Re: CI/windows docker vs "am a service" autodetection on windows

2021-03-04 Thread Magnus Hagander
On Thu, Mar 4, 2021 at 9:30 PM Andres Freund  wrote:
>
> Hi,
>
> On 2021-03-04 21:08:30 +0100, Magnus Hagander wrote:
> > The problem with doing it at register time is that everybody who
> > builds an installer for PostgreSQL will then have to do it in their
> > own registration (I'm pretty sure most of them don't use pg_ctl
> > register).
>
> Well, hm, maybe they should change that?
>
>
> > The same thing in pgwin32_doRunAsService() might help with that.
>
> What do you mean by this?

I mean controlling this flag by entry into pgwin32_doRunAsService().
So that when we start *postgres* we pass a parameter along saying that
it's a service and should use eventlog for the early exit.
pgwin32_doRunAsService() will (of course) only be called when started
with runservice. That would, I think, sort out the problem for the
postgres processes, and leave us with just pg_ctl to figure out.


> > But then we'd have to figure out what to do if pg_ctl fails prior to
> > reaching that point... There aren't that many such paths, but there
> > are some.
> >
> > Just throwing out ideas without spending time thinking about it, maybe
> > log to *both* in the case when we pick by it by autodetection?
>
> I think that's a good answer for pg_ctl - not so sure about postgres
> itself, at least once it's up and running. I don't know what led to all
> of this autodetection stuff, but is there the possibility of blocking on
> whatever stderr is set to as a service?
>
> Perhaps we could make the service detection more reliable by checking
> whether stderr is actually something useful?

So IIRC, and mind that this is like 15 years ago, there is something
that looks like stderr, but the contents are thrown away. It probably
exists specifically so that programs won't crash when run as a
service...


> There does seem to be isatty(), so we could improve the case of
> pg_ctl/postgres run interactively without breaking a sweat. And there is
> fstat() too, so if stderr in a service is something distinguishable...

We seem to have used that at some point, but commit
a967613911f7ef7b6387b9e8718f0ab8f0c4d9c8 got rid of it... But maybe
apply it in a combination.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: CI/windows docker vs "am a service" autodetection on windows

2021-03-04 Thread Andres Freund
Hi,

On 2021-03-04 21:08:30 +0100, Magnus Hagander wrote:
> The problem with doing it at register time is that everybody who
> builds an installer for PostgreSQL will then have to do it in their
> own registration (I'm pretty sure most of them don't use pg_ctl
> register).

Well, hm, maybe they should change that?


> The same thing in pgwin32_doRunAsService() might help with that.

What do you mean by this?


> But then we'd have to figure out what to do if pg_ctl fails prior to
> reaching that point... There aren't that many such paths, but there
> are some.
>
> Just throwing out ideas without spending time thinking about it, maybe
> log to *both* in the case when we pick by it by autodetection?

I think that's a good answer for pg_ctl - not so sure about postgres
itself, at least once it's up and running. I don't know what led to all
of this autodetection stuff, but is there the possibility of blocking on
whatever stderr is set to as a service?


Perhaps we could make the service detection more reliable by checking
whether stderr is actually something useful?

There does seem to be isatty(), so we could improve the case of
pg_ctl/postgres run interactively without breaking a sweat. And there is
fstat() too, so if stderr in a service is something distinguishable...

Greetings,

Andres Freund




Re: PROXY protocol support

2021-03-04 Thread Jan Wieck

On 3/4/21 2:45 PM, Jacob Champion wrote:

On Thu, 2021-03-04 at 10:42 +0900, Tatsuo Ishii wrote:

Is there any formal specification for the "a protocol common and very
light weight in proxies"?


See

 https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt

which is maintained by HAProxy Technologies.

--Jacob



This looks like it would only need a few extra protocol messages to be 
understood by the backend. It might be possible to implement that with 
the loadable wire protocol extensions proposed here:


https://commitfest.postgresql.org/32/3018/


Regards, Jan

--
Jan Wieck
Principal Database Engineer
Amazon Web Services




Re: CI/windows docker vs "am a service" autodetection on windows

2021-03-04 Thread Magnus Hagander
On Thu, Mar 4, 2021 at 8:33 PM Andrew Dunstan  wrote:
>
>
> On 3/4/21 2:08 PM, Andres Freund wrote:
> > [...] pgwin32_is_service() doesn't actually reliably detect if running as
> > a service - it's a heuristic that also triggers when running postgres
> > within a windows docker container (presumably because that itself is run
> > from within a service?).
> >
> >
> > ISTM that that's a problem, and is likely to become more of a problem
> > going forward (assuming that docker on windows will become more
> > popular).
> >
> >
> > My opinion is that the whole attempt at guessing whether we are running
> > as a service is a bad idea. This isn't the first time to be a problem,
> > see e.g. [1].
> >
> > Why don't we instead have pgwin32_doRegister() include a parameter that
> > indicates we're running as a service and remove all the heuristics?
>
>
>
> I assume you mean a postmaster parameter, that would be set via pg_ctl?
> Seems reasonable.

The problem with doing it at register time is that everybody who
builds an installer for PostgreSQL will then have to do it in their
own registration (I'm pretty sure most of them don't use pg_ctl
register).

The same thing in pgwin32_doRunAsService() might help with that. But
then we'd have to figure out what to do if pg_ctl fails prior to
reaching that point... There aren't that many such paths, but there
are some.

Just throwing out ideas without spending time thinking about it, maybe
log to *both* in the case when we pick by it by autodetection?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: PROXY protocol support

2021-03-04 Thread Jacob Champion
On Wed, 2021-03-03 at 10:39 +0100, Magnus Hagander wrote:
> On Wed, Mar 3, 2021 at 10:00 AM Magnus Hagander  wrote:
> > Another option would of course be to listen on a separate port for it,
> > which seems to be the "haproxy way". That would be slightly more code
> > (we'd still want to keep the code for validating the list of trusted
> > proxies I'd say), but maybe worth doing?
> 
> In order to figure that out, I hacked up a poc on that. Once again
> without updates to the docs, but shows approximately how much code
> complexity it adds (not much).

From a configuration perspective, I like that the separate-port
approach can shift the burden of verifying trust to an external
firewall, and that it seems to match the behavior of other major server
software. But I don't have any insight into the relative security of
the two options in practice; hopefully someone else can chime in.

>memset((char *) &hints, 0, sizeof(hints));
>hints.ai_flags = AI_NUMERICHOST;
>hints.ai_family = AF_UNSPEC;
> 
>ret = pg_getaddrinfo_all(tok, NULL, &hints, &gai_result);

Idle thought I had while setting up a local test rig: Are there any
compelling cases for allowing PROXY packets to arrive over Unix
sockets? (By which I mean, the proxy is running on the same machine as
Postgres, and connects to it using the .s.PGSQL socket file instead of
TCP.) Are there cases where you want some other software to interact
with the TCP stack instead of Postgres, but it'd still be nice to have
the original connection information available?

--Jacob


Re: Removing support for COPY FROM STDIN in protocol version 2

2021-03-04 Thread Tom Lane
Heikki Linnakangas  writes:
>> I concur that 0001 attached is committable.  I have not looked at
>> your 0002, though.

> Removed the entry from nls.mk, and pushed 0001. Thanks!

It seems that buildfarm member walleye doesn't like this.
Since nothing else is complaining, I confess bafflement
as to why.  walleye seems to be our only active mingw animal,
so maybe there's a platform dependency somewhere ... but
how would (mostly) removal of code expose that?

regards, tom lane




Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2021-03-04 Thread Alvaro Herrera
On 2021-Mar-04, Alvaro Herrera wrote:

> I think the problem is that the project is called pipeline and not
> test_libpq, so there's no match in the name.  I'm going to rename the
> whole thing to src/test/modules/libpq_pipeline/ and see if the msvc
> tooling likes that better.

v31.

-- 
Álvaro Herrera   Valdivia, Chile
"I'm impressed how quickly you are fixing this obscure issue. I came from 
MS SQL and it would be hard for me to put into words how much of a better job
you all are doing on [PostgreSQL]."
 Steve Midgley, http://archives.postgresql.org/pgsql-sql/2008-08/msg0.php
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 0553279314..c87b0ce911 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -3173,6 +3173,33 @@ ExecStatusType PQresultStatus(const PGresult *res);

   
  
+
+ 
+  PGRES_PIPELINE_SYNC
+  
+   
+The PGresult represents a
+synchronization point in pipeline mode, requested by 
+.
+This status occurs only when pipeline mode has been selected.
+   
+  
+ 
+
+ 
+  PGRES_PIPELINE_ABORTED
+  
+   
+The PGresult represents a pipeline that has
+received an error from the server.  PQgetResult
+must be called repeatedly, and each time it will return this status code
+until the end of the current pipeline, at which point it will return
+PGRES_PIPELINE_SYNC and normal processing can
+resume.
+   
+  
+ 
+
 
 
 If the result status is PGRES_TUPLES_OK or
@@ -4919,6 +4946,498 @@ int PQflush(PGconn *conn);
 
  
 
+ 
+  Pipeline Mode
+
+  
+   libpq
+   pipeline mode
+  
+
+  
+   pipelining
+   in libpq
+  
+
+  
+   batch mode
+   in libpq
+  
+
+  
+   libpq pipeline mode allows applications to
+   send a query without having to read the result of the previously
+   sent query.  Taking advantage of the pipeline mode, a client will wait
+   less for the server, since multiple queries/results can be sent/
+   received in a single network transaction.
+  
+
+  
+   While pipeline mode provides a significant performance boost, writing
+   clients using the pipeline mode is more complex because it involves
+   managing a queue of pending queries and finding which result
+   corresponds to which query in the queue.
+  
+
+  
+   Pipeline mode also generally consumes more memory on both the client and server,
+   though careful and aggressive management of the send/receive queue can mitigate
+   this.  This applies whether or not the connection is in blocking or non-blocking
+   mode.
+  
+
+  
+   Using Pipeline Mode
+
+   
+To issue pipelines, the application must switch a connection into pipeline mode.
+Enter pipeline mode with 
+or test whether pipeline mode is active with
+.
+In pipeline mode, only asynchronous operations
+are permitted, and COPY is disallowed.
+Using any synchronous command execution functions,
+such as PQfn, or PQexec
+and its sibling functions, is an error condition.
+   
+
+   
+
+ It is best to use pipeline mode with libpq in
+ non-blocking mode. If used
+ in blocking mode it is possible for a client/server deadlock to occur.
+  
+   
+The client will block trying to send queries to the server, but the
+server will block trying to send results to the client from queries
+it has already processed. This only occurs when the client sends
+enough queries to fill both its output buffer and the server's receive
+buffer before it switches to processing input from the server,
+but it's hard to predict exactly when that will happen.
+   
+  
+
+   
+
+   
+Issuing Queries
+
+
+ After entering pipeline mode, the application dispatches requests using
+ , 
+ or its prepared-query sibling
+ .
+ These requests are queued on the client-side until flushed to the server;
+ this occurs when  is used to
+ establish a synchronization point in the pipeline,
+ or when  is called.
+ The functions ,
+ , and
+  also work in pipeline mode.
+ Result processing is described below.
+
+
+
+ The server executes statements, and returns results, in the order the
+ client sends them.  The server will begin executing the commands in the
+ pipeline immediately, not waiting for the end of the pipeline.
+ If any statement encounters an error, the server aborts the current
+ transaction and skips processing commands in the pipeline until the
+ next synchronization point established by PQsendPipeline.
+ (This remains true even if the commands in the pipeline would rollback
+ the transaction.)
+ Query processing resumes after the synchronization point.
+
+
+
+ It's fine for 

Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2021-03-04 Thread Alvaro Herrera
On 2021-Mar-04, Alvaro Herrera wrote:

> v30 contains changes to hopefully make it build on MSVC.

Hm, that didn't work -- appveyor still says:

Project "C:\projects\postgresql\pgsql.sln" (1) is building 
"C:\projects\postgresql\pipeline.vcxproj" (75) on node 1 (default targets).
PrepareForBuild:
  Creating directory ".\Release\pipeline\".
  Creating directory ".\Release\pipeline\pipeline.tlog\".
InitializeBuildStatus:
  Creating ".\Release\pipeline\pipeline.tlog\unsuccessfulbuild" because 
"AlwaysCreate" was specified.
ClCompile:
  C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64\CL.exe 
/c /Isrc/include /Isrc/include/port/win32 /Isrc/include/port/win32_msvc /Zi 
/nologo /W3 /WX- /Ox /D WIN32 /D _WINDOWS /D __WINDOWS__ /D __WIN32__ /D 
WIN32_STACK_RLIMIT=4194304 /D _CRT_SECURE_NO_DEPRECATE /D 
_CRT_NONSTDC_NO_DEPRECATE /D _MBCS /GF /Gm- /EHsc /MD /GS /fp:precise 
/Zc:wchar_t /Zc:forScope /Fo".\Release\pipeline\\" 
/Fd".\Release\pipeline\vc120.pdb" /Gd /TC /wd4018 /wd4244 /wd4273 /wd4102 
/wd4090 /wd4267 /errorReport:queue /MP src/test/modules/test_libpq/pipeline.c
  pipeline.c
src/test/modules/test_libpq/pipeline.c(11): fatal error C1083: Cannot open 
include file: 'libpq-fe.h': No such file or directory 
[C:\projects\postgresql\pipeline.vcxproj]
Done Building Project "C:\projects\postgresql\pipeline.vcxproj" (default 
targets) -- FAILED.
Project "C:\projects\postgresql\pgsql.sln" (1) is building 
"C:\projects\postgresql\test_parser.vcxproj" (76) on node 1 (default targets).

I think the problem is that the project is called pipeline and not test_libpq,
so there's no match in the name.  I'm going to rename the whole thing to
src/test/modules/libpq_pipeline/ and see if the msvc tooling likes that
better.


-- 
Álvaro Herrera39°49'30"S 73°17'W




Re: PROXY protocol support

2021-03-04 Thread Jacob Champion
On Thu, 2021-03-04 at 10:42 +0900, Tatsuo Ishii wrote:
> Is there any formal specification for the "a protocol common and very
> light weight in proxies"?

See

https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt

which is maintained by HAProxy Technologies.

--Jacob


  1   2   >