Re: ltree_gist indexes broken after pg_upgrade from 12 to 13
On 3/4/22 23:09, Nikita Glukhov wrote:
> On 04.03.2022 23:28, Tom Lane wrote:
>> Tomas Vondra writes:
>>> On 3/4/22 20:29, Nikita Glukhov wrote:
>>>> So, we probably have corrupted indexes that were updated since such
>>>> "incomplete" upgrade of ltree.
>>>
>>> IIRC pg_upgrade is not expected to upgrade extensions - it keeps the
>>> installed version of the extension, and that's intentional.
>>
>> Yeah, exactly.  But this opens up an additional consideration we
>> have to account for: whatever we do needs to work with either 1.1
>> or 1.2 SQL-level versions of the extension.
>>
>> regards, tom lane
>
> It becomes clear that ltree upgrade 1.1 => 1.2 is broken, the problem
> is not so much related to PG12 => PG13+ upgrades.

Well, it's quite a mess :-( It very clearly is not just a 1.1 -> 1.2 upgrade issue, because you can get a crash with 1.1 after a PG12 -> PG13 upgrade, as demonstrated earlier. So just "fixing" the extension upgrade is not enough. But as you showed, the 1.1 -> 1.2 upgrade is broken too.

> You can see here another problem: installation of opclass options
> procedure does not invalidate relation's opclass options cache.

Seems like that.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [PoC] Let libpq reject unexpected authentication requests
On Fri, Mar 04, 2022 at 08:19:26PM -0500, Tom Lane wrote:
> Jacob Champion writes:
>> Here is my take on option 2, then: you get to choose exactly one method
>> that the client will accept. If you want to use client certificates,
>> use require_auth=cert. If you want to force SCRAM, use
>> require_auth=scram-sha-256. If the server asks for something different,
>> libpq will fail. If the server tries to get away without asking you for
>> authentication, libpq will fail. There is no negotiation.

Fine by me to put all the control on the client side; that makes the whole thing much simpler to reason about.

> Seems reasonable, but I bet that for very little more code you could
> accept a comma-separated list of allowed methods; libpq already allows
> comma-separated lists for some other connection options. That seems
> like it'd be a useful increment of flexibility.

Same impression here, so +1 for supporting a comma-separated list of values here. This is already handled in parse_comma_separated_list(), now used for multiple hosts and hostaddrs.

--
Michael
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
On 2022-03-04 20:06:43 -0800, Andres Freund wrote:
> On 2022-03-05 16:39:21 +1300, Thomas Munro wrote:
>> I vote for committing that workaround into the tree temporarily,
>> because it's not just cfbot, it's also everyone's dev branches on
>> Github + the official mirror that are red.
>
> I'll do so after making dinner, unless you want to do so sooner. It did fix
> the problem (intermixed with a few irrelevant changes):
> https://cirrus-ci.com/task/4928987829895168

Pushed.
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Hi,

On 2022-03-05 16:39:21 +1300, Thomas Munro wrote:
> On Sat, Mar 5, 2022 at 4:19 PM Andres Freund wrote:
>> https://github.com/cirruslabs/cirrus-ci-agent/issues/218#issuecomment-1059657781
>
> Oof. Nice detective work.

Thanks.

>> As indicated in the message above it, there's a workaround. Not sure if worth
>> committing, if they were to fix it in a few days? cfbot could be repaired by
>> just adding a repo environment variable of CIRRUS_AGENT_VERSION 1.73.2...
>> OTOH, committing and reverting a single line + comment is cheap.
>
> I vote for committing that workaround into the tree temporarily,
> because it's not just cfbot, it's also everyone's dev branches on
> Github + the official mirror that are red.

I'll do so after making dinner, unless you want to do so sooner. It did fix the problem (intermixed with a few irrelevant changes): https://cirrus-ci.com/task/4928987829895168

- Andres
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
On Sat, Mar 5, 2022 at 4:19 PM Andres Freund wrote:
> https://github.com/cirruslabs/cirrus-ci-agent/issues/218#issuecomment-1059657781

Oof. Nice detective work.

> As indicated in the message above it, there's a workaround. Not sure if worth
> committing, if they were to fix it in a few days? cfbot could be repaired by
> just adding a repo environment variable of CIRRUS_AGENT_VERSION 1.73.2...
> OTOH, committing and reverting a single line + comment is cheap.

I vote for committing that workaround into the tree temporarily, because it's not just cfbot, it's also everyone's dev branches on Github + the official mirror that are red.
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Hi,

On 2022-03-05 13:21:26 +1300, Thomas Munro wrote:
> On Sat, Mar 5, 2022 at 10:56 AM Andres Freund wrote:
>> I suspect one also needs the console detach thing.
>
> I tried adding DETACHED_PROCESS (which should be like calling
> FreeConsole() in child process) and then I tried CREATE_NEW_CONSOLE
> instead, on top of CREATE_NEW_PROCESS_GROUP. Neither helped, though I
> lost the postmaster's output.

I think the issue with the process group is real, but independent of the failing test. Locally, just specifying CREATE_NEW_PROCESS_GROUP fixes the problem of a pg_ctl start'ed database being stopped after ctrl-c.

I think CI failing is due to a cirrus bug, assuming a bit too much similarity between unix and windows: https://github.com/cirruslabs/cirrus-ci-agent/issues/218#issuecomment-1059657781

As indicated in the message above it, there's a workaround. Not sure if worth committing, if they were to fix it in a few days? cfbot could be repaired by just adding a repo environment variable of CIRRUS_AGENT_VERSION 1.73.2... OTOH, committing and reverting a single line + comment is cheap.

Greetings,

Andres Freund
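For reference, the repository-level workaround mentioned above would look roughly like this if expressed as an in-tree .cirrus.yml fragment rather than a repo environment variable. The exact key placement is an assumption based on the linked cirrus-ci-agent issue, not a tested configuration:

```yaml
# Hypothetical .cirrus.yml fragment: pin the Cirrus CI agent to the last
# known-good version until the upstream fix ships. The variable name
# CIRRUS_AGENT_VERSION comes from the linked cirrus-ci-agent issue.
env:
  CIRRUS_AGENT_VERSION: 1.73.2
```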
Re: Adding CI to our tree (ccache)
Hi,

On 2022-03-03 22:56:15 -0600, Justin Pryzby wrote:
> On Sun, Feb 20, 2022 at 12:47:31PM -0800, Andres Freund wrote:
>> On 2022-02-20 13:36:55 -0600, Justin Pryzby wrote:
>>> Have you tried to use the yet-to-be-released ccache with MSVC ?
>>
>> Yes, it doesn't work, because it requires cl.exe to be used in a specific way
>> (only a single input file, specific output file naming). Which would require a
>> decent amount of changes to src/tools/msvc. I think it's more realistic with
>> meson etc.
>
> Did you get to the point that that causes a problem, or did you just realize
> that it was a limitation that seems to preclude its use ?

I tried to use it, but saw that no caching was happening, and debugged it. Which yielded that it can't be used due to the way output files are specified (and due to multiple files, but that can be prevented with an msbuild parameter).

> If so, could you send the branch/commit you had ?

This was in a local VM, not cirrus. I ended up making it work with the meson build, after a bit of fiddling. Although bypassing msbuild (by building with ninja, using cl.exe) is a larger win...

> The error I'm getting when I try to use ccache involves .rst files, which don't
> exist (and which ccache doesn't know how to find or ignore).
> https://cirrus-ci.com/task/5441491957972992

ccache has code to deal with response files. I suspect the problem here is rather that ccache expects the compiler as an argument.

> I gather this is the difference between "compiling with MSVC" and compiling
> with a visual studio project.

I doubt it's related to that.

Greetings,

Andres Freund
Re: [PoC] Let libpq reject unexpected authentication requests
Jacob Champion writes:
> $subject keeps coming up in threads. I think my first introduction to
> it was after the TLS injection CVE, and then it came up again in the
> pluggable auth thread. It's hard for me to generalize based on "sound
> bites", but among the proposals I've seen are
> 1. reject plaintext passwords
> 2. reject a configurable list of unacceptable methods
> 3. allow client and server to negotiate a method
> All of them seem to have merit.

Agreed.

> Here is my take on option 2, then: you get to choose exactly one method
> that the client will accept. If you want to use client certificates,
> use require_auth=cert. If you want to force SCRAM, use
> require_auth=scram-sha-256. If the server asks for something different,
> libpq will fail. If the server tries to get away without asking you for
> authentication, libpq will fail. There is no negotiation.

Seems reasonable, but I bet that for very little more code you could accept a comma-separated list of allowed methods; libpq already allows comma-separated lists for some other connection options. That seems like it'd be a useful increment of flexibility.

regards, tom lane
Re: Add the replication origin name and commit-LSN to logical replication worker errcontext
On Fri, Mar 4, 2022 at 6:02 PM Euler Taveira wrote:
> On Fri, Mar 4, 2022, at 2:54 AM, Amit Kapila wrote:
>> The LSN of the transaction that contains the change violating the
>> constraint and the replication origin name can be found from the
>> server log (LSN 0/14C0378 and replication origin pg_16395 in the
>> above case). To skip the transaction, the subscription needs to be
>> disabled temporarily by ALTER SUBSCRIPTION ... DISABLE first. Then,
>> the transaction can be skipped by calling the
>> pg_replication_origin_advance() function with the node_name (i.e.,
>> pg_16395) and the next LSN of the commit LSN (i.e., LSN 0/14C0379).
>
> You could also add:
>
>   After that the replication can be resumed by
>   ALTER SUBSCRIPTION ... ENABLE.
>
> Let's provide complete instructions.

+1. That sounds reasonable to me.

--
With Regards,
Amit Kapila.
[PoC] Let libpq reject unexpected authentication requests
Hello,

TL;DR: this patch lets you specify exactly one authentication method in the connection string, and libpq will fail the connection if the server doesn't use that method.

(This is not intended for PG15. I'm generally anxious about posting experimental work during a commitfest, but there's been enough conversation about this topic recently that I felt like it'd be useful to have code to point to.)

== Proposal and Alternatives ==

$subject keeps coming up in threads. I think my first introduction to it was after the TLS injection CVE, and then it came up again in the pluggable auth thread. It's hard for me to generalize based on "sound bites", but among the proposals I've seen are

1. reject plaintext passwords
2. reject a configurable list of unacceptable methods
3. allow client and server to negotiate a method

All of them seem to have merit. I'm personally motivated by the case brought up by the CVE: if I'm expecting client certificate authentication, it's not acceptable for the server to extract _any_ information about passwords from my system, whether they're plaintext, hashed, or SCRAM-protected. So I chose not to implement option 1. And option 3 looked like a lot of work to take on in an experiment without a clear consensus.

Here is my take on option 2, then: you get to choose exactly one method that the client will accept. If you want to use client certificates, use require_auth=cert. If you want to force SCRAM, use require_auth=scram-sha-256. If the server asks for something different, libpq will fail. If the server tries to get away without asking you for authentication, libpq will fail. There is no negotiation.

== Why Force Authn? ==

I think my decision to fail if the server doesn't authenticate might be controversial. It doesn't provide additional protection against active attack unless you're using a mutual authentication method (SCRAM), because you can't prove that the server actually did anything with its side of the handshake.
But this approach grew on me for a few reasons:

- When using SCRAM, it allows the client to force a server to
  authenticate itself, even when channel bindings aren't being used. (I
  really think it's weird that we let the server get away with that
  today.)

- In cases where you want to ensure that your actions are logged for
  later audit, you can be reasonably sure that you're connecting to a
  database that hasn't been configured with a `trust` setting.

- For cert authentication, it ensures that the server asked for a cert
  and that you actually sent one. This is more forward-looking; today,
  we always ask for a certificate from the client even if we don't use
  it. But if implicit TLS takes off, I'd expect to see more middleware,
  with more potential for misconfiguration.

== General Thoughts ==

I like that this approach fits nicely into the existing code. The majority of the patch just beefs up check_expected_areq(). The new flag that tracks whether or not we've authenticated is scattered around more than I would like, but I'm hopeful that some of the pluggable auth conversations will lead to nice refactoring opportunities for those internal helpers.

There's currently no way to prohibit client certificates from being sent. If my use case is "servers shouldn't be able to extract password info if the client expects certificates", then someone else may very well say "servers shouldn't be able to extract my client certificate if I wanted to use SCRAM". Likewise, this feature won't prevent a GSS authenticated channel -- but we do have gssencmode=disable, so I'm less concerned there. I made the assumption that a GSS encrypted channel authenticates both parties to each other, but I don't actually know what guarantees are made there. I have not tested SSPI.

I'm not a fan of the multiple spellings of "password" ("ldap", "pam", et al). My initial thought was that it'd be nice to match the client setting to the HBA setting in the server.
But I don't think it's really all that helpful; worst-case, it pretends to do something it can't, since the client can't determine what the plaintext password is actually used for on the backend. And if someone devises (say) a SASL scheme for proxied LDAP authentication, that spelling becomes obsolete.

Speaking of obsolete, the current implementation assumes that any SASL exchange must be for SCRAM. That won't be anywhere close to future-proof.

Thanks,
--Jacob

From 545a89aafacd0f997cd3e14cd20b192335eafadc Mon Sep 17 00:00:00 2001
From: Jacob Champion
Date: Mon, 28 Feb 2022 09:40:43 -0800
Subject: [PATCH] libpq: let client reject unexpected auth methods

The require_auth connection option allows the client to choose one (and only one) authentication type for use with the server. There is no negotiation: if the server does not present the expected authentication request, the connection fails.

Internally, the patch expands the role of check_expected_areq() to ensure that the incoming request matches conn->require_auth. It also
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
On Sat, Mar 5, 2022 at 10:56 AM Andres Freund wrote:
> I suspect one also needs the console detach thing.

I tried adding DETACHED_PROCESS (which should be like calling FreeConsole() in child process) and then I tried CREATE_NEW_CONSOLE instead, on top of CREATE_NEW_PROCESS_GROUP. Neither helped, though I lost the postmaster's output.

Things I tried and their output are here: https://github.com/macdice/postgres/commits/hack3
Re: Proposal: Support custom authentication methods using hooks
>> I still don't understand why using plaintext password authentication
>> over SSL connection is considered insecure. Actually we have been
>> stating the opposite in the manual:
>> https://www.postgresql.org/docs/14/auth-password.html
>>
>> "If the connection is protected by SSL encryption then password can be
>> used safely, though."
>
> If you aren't doing client verification (i.e., cert in pg_hba) and are
> not doing verify-full on the client side then a man-in-the-middle
> attack on TLS is trivial, and the plaintext password will be
> sniffable.

So the plaintext password is safe if used with hostssl + verify-full (server side) and sslmode = verify-full (client side), right?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
On 2022-03-04 14:04:44 -0800, Andres Freund wrote:
> Trying to make it use the current user now - I don't know what
> permissions are needed to run a service as a user or such.

My first attempt of using %USERNAME% failed:

  [22:10:20.862] c:\cirrus>call tmp_install\bin\pg_ctl.exe register -D tmp_check/db "-UContainerAdministrator"
  [22:10:20.889] pg_ctl: could not register service "PostgreSQL": error code 1057

It seems to require a domain for some reason:

  [22:33:54.599] c:\cirrus>call tmp_install\bin\pg_ctl.exe register -Dtmp_check/db -U "User Manager\ContainerAdministrator"

but that then doesn't start:

  [22:33:54.660] c:\cirrus>call net start PostgreSQL
  [22:33:55.887] System error 1068 has occurred.

So it indeed seems hard to start a service as the current user, at least with my limited windows knowledge.
Re: role self-revocation
On Fri, Mar 4, 2022 at 1:50 PM Robert Haas wrote:
> On Mon, Feb 28, 2022 at 2:09 PM Stephen Frost wrote:
>> The ability of a role to revoke itself from some other role is just
>> something we need to accept as being a change that needs to be made, and
>> I do believe that such a change is supported by the standard, in that a
>> REVOKE will only work if you have the right to make it as the user who
>> performed the GRANT in the first place.
>
> First, a quick overview of the issue for those who have not followed the
> earlier threads in their grueling entirety:
>
> rhaas=# create user boss;
> CREATE ROLE
> rhaas=# create user peon;
> CREATE ROLE
> rhaas=# grant peon to boss;
> GRANT ROLE
> rhaas=# \c - peon
> You are now connected to database "rhaas" as user "peon".
> rhaas=> revoke peon from boss; -- i don't like being bossed around!
> REVOKE ROLE

The wording for this example is hurting my brain.

GRANT admin TO joe;
\c admin
REVOKE admin FROM joe;

> I argue (and Stephen seems to agree) that the peon shouldn't be able
> to undo the superuser's GRANT.

I think I disagree. Or, at least, the superuser has full control of dictating how role membership is modified, and that seems sufficient. The example above works because of:

"A role is not considered to hold WITH ADMIN OPTION on itself, but it may grant or revoke membership in itself from a database session where the session user matches the role."

If a superuser doesn't want "admin" to modify its own membership then they can prevent anyone but a superuser from being able to have a session_user of "admin". If that happens then the only way a non-superuser can modify group membership is by being added to said group WITH ADMIN OPTION.

Now, if two people and a superuser are all doing membership management on the same group, and we want to add permission checks and multiple grants as tools, instead of having them just communicate with each other, then by all means let us do so.
In that case, in answer to questions 2 and 3, we should indeed track which session_user made the grant and only allow the same session_user or the superuser to revoke it (we'd want to stop "ignoring" the GRANTED BY clause of REVOKE ROLE FROM so the superuser at least could remove grants made via WITH ADMIN OPTION).

> 4. Should we apply this rule to other types of grants, rather than
> just to role membership? Why or why not? Consider this:
>
> The fact that the accountant may not be in favor of the auditor seeing
> what the accountant is doing with the money is precisely the reason why
> we have auditors. [...] However, when the superuser performs the GRANT
> as in the above example, the grantor is recorded as the table owner,
> not the superuser! So if we really want role membership and other kinds
> of grants to behave in the same way, we have our work cut out for us
> here.

Yes, this particular choice seems unfortunate, but also not something that I think is necessarily mandatory for us to improve. If the accountant is the owner then yes, they get to decide permissions. In the presence of an auditor role either you trust the accountant role to keep the permissions in place or you define a superior authority to both the auditor and accountant to be the owner. Or let the superuser manage everything by withholding login and WITH ADMIN OPTION privileges from the ownership role.

If we do extend role membership tracking, I suppose the design question is whether the new role grantor dependency tracking will have a superuser be the recorded grantor instead of some owner. Given that roles don't presently have an owner concept, consistency with existing permissions in this manner would be trickier. Because of this, I would probably leave role grantor tracking at the session_user level while database objects continue to emanate from the object owner.
The global vs database differences seem like a sufficient theoretical justification for the difference in implementation.

David J.
Re: ltree_gist indexes broken after pg_upgrade from 12 to 13
On 04.03.2022 23:28, Tom Lane wrote:
> Tomas Vondra writes:
>> On 3/4/22 20:29, Nikita Glukhov wrote:
>>> So, we probably have corrupted indexes that were updated since such
>>> "incomplete" upgrade of ltree.
>>
>> IIRC pg_upgrade is not expected to upgrade extensions - it keeps the
>> installed version of the extension, and that's intentional.
>
> Yeah, exactly.  But this opens up an additional consideration we
> have to account for: whatever we do needs to work with either 1.1
> or 1.2 SQL-level versions of the extension.
>
> regards, tom lane

It becomes clear that the ltree upgrade 1.1 => 1.2 is broken; the problem is not so much related to PG12 => PG13+ upgrades. The following example crashes PG13+:

CREATE EXTENSION ltree VERSION "1.1";
CREATE TABLE test AS SELECT i::text::ltree data FROM generate_series(1, 10) i;
CREATE INDEX ON test USING gist (data);

ALTER EXTENSION ltree UPDATE TO "1.2";

-- works, cached bytea options is still NULL and siglen is 28
SELECT * FROM test WHERE data = '12345';

\connect

-- crash, siglen = 8
SELECT * FROM test WHERE data = '12345';

You can see here another problem: installation of the opclass options procedure does not invalidate the relation's opclass options cache.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Andres Freund writes:
> I don't really understand why start_postmaster() bothers to wrap postmaster
> start through cmd.exe, particularly when it prevents us from getting
> postmaster's pid. Also the caveats around cmd.exe and sharing mode.

I think the basic idea is to avoid having to write our own code to do the I/O redirections --- that's certainly why we use a shell on the Unix side. But it might well be that biting the bullet and handling redirection ourselves would be easier than coping with all these other problems.

regards, tom lane
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Hi,

On 2022-03-04 16:51:32 -0500, Tom Lane wrote:
> Andres Freund writes:
>> That fixed the immediate problem, but dblink, postgres_fdw tests failed:
>> +ERROR: could not establish connection
>> +DETAIL: connection to server at "localhost" (::1), port 5432 failed: FATAL:
>> role "SYSTEM" does not exist
>
> [ scratches head... ] Where is libpq getting that username from?
> Why is it different from whatever we'd determined during initdb?
> (Maybe a case-folding issue??)

When running as a service (via pg_ctl register) the default username to run under appears to be SYSTEM. Which then differs from the user that vcregress.pl runs under. Trying to make it use the current user now - I don't know what permissions are needed to run a service as a user or such.

>> The dblink and fdw failures can presumably be fixed by passing current_user as
>> the username. That seems like a good idea anyway?
>
> I don't think it's a good idea to hack that without understanding
> why it's suddenly going wrong.

I think I understand why - seems more a question of whether we want to support running tests against a server running as a different user.

Greetings,

Andres Freund
Re: Avoiding smgrimmedsync() during nbtree index builds
On Wed, Mar 2, 2022 at 8:09 PM Justin Pryzby wrote:
> Rebased to appease cfbot.
>
> I ran these patches under a branch which shows code coverage in cirrus. It
> looks good to my eyes.
> https://api.cirrus-ci.com/v1/artifact/task/5212346552418304/coverage/coverage/00-index.html

Thanks for doing that and for the rebase! Since I made updates, the attached version 6 is also rebased.

To Dmitry's question:

On Sun, Feb 13, 2022 at 9:33 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> On Wed, Feb 09, 2022 at 01:49:30PM -0500, Melanie Plageman wrote:
> I don't see it in the discussion, so naturally curious -- why directmgr
> is not used for bloom index, e.g. in blbuildempty?

Thanks for pointing this out. blbuildempty() is also included now. bloom doesn't seem to use smgr* anywhere except blbuildempty(), so that is the only place I made changes in bloom index build.

v6 has the following updates/changes:

- removed erroneous extra calls to unbuffered_prep() and
  unbuffered_finish() in GiST and btree index builds

- removed unnecessary includes

- some comments were updated to accurately reflect use of directmgr

- smgrwrite doesn't WAL-log anymore. This one I'm not sure about. I
  think it makes sense that unbuffered_extend() of non-empty pages of
  WAL-logged relations (or the init fork of unlogged relations) do
  log_newpage(), but I wasn't so sure about smgrwrite(). Currently all
  callers of smgrwrite() do log_newpage() and anyone using mdwrite()
  will end up writing the whole page. However, it seems possible that a
  caller of the directmgr API might wish to do a write to a particular
  offset and, either because of that, or, for some other reason, require
  different logging than that done in log_newpage(). I didn't want to
  make a separate wrapper function for WAL-logging in directmgr because
  it felt like one more step to forget.

- heapam_relation_set_new_filenode doesn't use directmgr API anymore for
  unlogged relations.
It doesn't extend or write, so it seemed like a special case better left alone. Note that the ambuildempty() functions which also write to the init fork of an unlogged relation still use the directmgr API. It is a bit confusing because they pass do_wal=true to unbuffered_prep() even though they are unlogged relations, because they must log and fsync.

- interface changes to unbuffered_prep()

I removed the parameters to unbuffered_prep() which required the user to specify if fsync should be added to pendingOps or done with smgrimmedsync(). Understanding all of the combinations of these parameters and when they were needed was confusing, and the interface felt like a foot gun. Special cases shouldn't use this interface. I prefer the idea that users of this API expect that

1) empty pages won't be checksummed or WAL logged

2) for relations that are WAL-logged, when the build is done, the relation will be fsync'd by the backend (unless the fsync optimization is being used)

3) the only case in which fsync requests are added to the pendingOps queue is when the fsync optimization is being used (which saves the redo pointer and checks it later to determine if it needs to fsync itself)

I also added the parameter "do_wal" to unbuffered_prep() and the UnBufferedWriteState struct. This is used when extending the file to determine whether or not to log_newpage(). unbuffered_extend() and unbuffered_write() no longer take do_wal as a parameter. Note that callers need to pass do_wal=true/false to unbuffered_prep() based on whether or not they want log_newpage() called during unbuffered_extend() -- not simply based on whether or not the relation in question is WAL-logged.

do_wal is the only member of the UnBufferedWriteState struct in the first patch in the set, but I think it makes sense to keep the struct around since I anticipate that the patch containing the other members needed for the fsync optimization will be committed at the same time.
One final note on unbuffered_prep() -- I am thinking of renaming "do_optimization" to "try_optimization" or maybe "request_fsync_optimization". The interface (of unbuffered_prep()) would be better if we always tried to do the optimization when relevant (when the relation is WAL-logged).

- freespace map, visimap, and hash index don't use directmgr API anymore

For most use cases writing/extending outside shared buffers, it isn't safe to rely on requesting fsync from checkpointer. visimap, fsm, and hash index all request fsync from checkpointer for the pages they write with smgrextend() and don't smgrimmedsync() when finished adding pages (even when the hash index is WAL-logged). Supporting these exceptions made the interface incoherent, so I cut them.

- added unbuffered_extend_range()

This one is a bit unfortunate. Because GiST index build uses log_newpages() to log a whole page range but calls smgrextend() for each of those pages individually, I
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Hi,

On 2022-03-05 09:29:01 +1300, Thomas Munro wrote:
> On Sat, Mar 5, 2022 at 7:10 AM Andres Freund wrote:
>> Or perhaps pg_ctl ought to pass CREATE_NEW_PROCESS_GROUP to CreateProcess()?
>> The lack of a process group would explain why we're getting signalled on
>> ctrl-c...
>
> I thought that sounded promising, given that the recent Cirrus agent
> commit you pointed to says "Always kill spawned shell's process group
> to avoid pipe FD hangs", and given that we do something conceptually
> similar on Unix. It doesn't seem to help, though...
>
> https://cirrus-ci.com/task/5572163880091648

I suspect one also needs the console detach thing.

I don't really understand why start_postmaster() bothers to wrap postmaster start through cmd.exe, particularly when it prevents us from getting postmaster's pid. Also the caveats around cmd.exe and sharing mode.

Greetings,

Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Andres Freund writes:
> That fixed the immediate problem, but dblink, postgres_fdw tests failed:
> +ERROR: could not establish connection
> +DETAIL: connection to server at "localhost" (::1), port 5432 failed: FATAL:
> role "SYSTEM" does not exist

[ scratches head... ] Where is libpq getting that username from? Why is it different from whatever we'd determined during initdb? (Maybe a case-folding issue??)

> The dblink and fdw failures can presumably be fixed by passing current_user as
> the username. That seems like a good idea anyway?

I don't think it's a good idea to hack that without understanding why it's suddenly going wrong.

regards, tom lane
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Hi,

On 2022-03-04 09:57:35 -0800, Andres Freund wrote:
> On 2022-03-04 09:46:44 -0800, Andres Freund wrote:
>> On 2022-03-04 09:30:37 -0800, Andres Freund wrote:
>>> I wonder if we're missing some steps, at least on windows, to make pg_ctl
>>> start independent of the starting shell?
>>
>> Sure looks that way. On windows, if I do pg_ctl start, then hit ctrl-c, the
>> server shuts down.
>
> Short term the easiest fix might be to start postgres for those tests as a
> service. But it seems we should fix whatever the cause of that
> terminal-connectedness behaviour is.
>
> I'm out for ~2-3h. I started a test run with using a service just now:
> https://cirrus-ci.com/task/5519573792325632 but I very well might have typoed
> something...

That fixed the immediate problem, but dblink, postgres_fdw tests failed:

https://api.cirrus-ci.com/v1/artifact/task/5519573792325632/log/contrib/dblink/regression.diffs

  FROM dblink(connection_parameters(),'SELECT * FROM foo') AS t(a int, b text, c text[]) WHERE t.a > 7;
 - a | b |     c
 - 8 | i | {a8,b8,c8}
 - 9 | j | {a9,b9,c9}
 -(2 rows)
 -
 +ERROR: could not establish connection
 +DETAIL: connection to server at "localhost" (::1), port 5432 failed: FATAL: role "SYSTEM" does not exist

https://api.cirrus-ci.com/v1/artifact/task/5519573792325632/log/contrib/postgres_fdw/regression.diffs

 +ERROR: could not connect to server "loopback"

and it also seems to redirect logging to the event log without further configuration...

The dblink and fdw failures can presumably be fixed by passing current_user as the username. That seems like a good idea anyway?

Greetings,

Andres Freund
Re: role self-revocation
Robert Haas writes:
> 1. What should be the exact rule for whether A can remove a grant made
> by B? Is it has_privs_of_role()? is_member_of_role()? Something else?

No strong opinion here, but I'd lean slightly to the more restrictive
option.

> 2. What happens if the same GRANT is enacted by multiple users? For
> example, suppose peon does "GRANT peon to boss" and then the superuser
> does the same thing afterwards, or vice versa? One design would be to
> try to track those as two separate grants, but I'm not sure if we want
> to add that much complexity, since that's not how we do it now and it
> would, for example, implicate the choice of PK on the pg_auth_members
> table.

As you note later, we *do* track such grants separately in ordinary
ACLs, and I believe this is clearly required by the SQL spec. It says
(for privileges on objects):

    Each privilege is represented by a privilege descriptor. A privilege
    descriptor contains:
    — The identification of the object on which the privilege is granted.
    — The authorization identifier of the grantor of the privilege.
                                           ^^^
    — The authorization identifier of the grantee of the privilege.
    — Identification of the action that the privilege allows.
    — An indication of whether or not the privilege is grantable.
    — An indication of whether or not the privilege has the WITH
      HIERARCHY OPTION specified.

Further down (4.42.3 in SQL:2021), the granting of roles is described,
and that says:

    Each role authorization is described by a role authorization
    descriptor. A role authorization descriptor includes:
    — The role name of the role.
    — The authorization identifier of the grantor.
    — The authorization identifier of the grantee.
    — An indication of whether or not the role authorization is grantable.

If we are not tracking the grantors of role authorizations, then we are
doing it wrong and we ought to fix that.

> 3. What happens if a user is dropped after being recorded as a
> grantor?

Should work the same as it does now for ordinary ACLs, ie, you gotta
drop the grant first.

> 4. Should we apply this rule to other types of grants, rather than
> just to role membership?

I am not sure about the reasoning behind the existing rule that
superuser-granted privileges are recorded as being granted by the object
owner. It does feel more like a wart than something we want. It might
have been a hack to deal with the lack of GRANTED BY options in
GRANT/REVOKE back in the day. Changing it could have some bad
compatibility consequences though. In particular, I believe it would
break existing pg_dump files, in that after restore all privileges would
be attributed to the restoring superuser, and there'd be no very easy
way to clean that up.

> Please note that it is not really my intention to try to shove
> anything into v15 here.

Agreed, this is not something to move on quickly. We might want to
think about adjusting pg_dump to use explicit GRANTED BY options in
GRANT/REVOKE a release or two before making incompatible changes.

regards, tom lane
role self-revocation
On Mon, Feb 28, 2022 at 2:09 PM Stephen Frost wrote: > The ability of a role to revoke itself from some other role is just > something we need to accept as being a change that needs to be made, and > I do believe that such a change is supported by the standard, in that a > REVOKE will only work if you have the right to make it as the user who > performed the GRANT in the first place. Moving this part of the discussion to a new thread to reduce confusion and hopefully get broader input on this topic. It seems like Stephen and I agree in principle that some change here is a good idea. If anyone else thinks it's a bad idea, then this would be a great time to mention that, ideally with reasons. If you agree that it's a good idea, then it would be great to have your views on the follow-up questions which I shall pose below. To the extent that it is reasonably possible to do so, I would like to try to keep focused on specific design questions rather than getting tangled up in general discussion of long-term direction. First, a quick overview of the issue for those who have not followed the earlier threads in their grueling entirety: rhaas=# create user boss; CREATE ROLE rhaas=# create user peon; CREATE ROLE rhaas=# grant peon to boss; GRANT ROLE rhaas=# \c - peon You are now connected to database "rhaas" as user "peon". rhaas=> revoke peon from boss; -- i don't like being bossed around! REVOKE ROLE I argue (and Stephen seems to agree) that the peon shouldn't be able to undo the superuser's GRANT. Furthermore, we also seem to agree that you don't necessarily have to be the exact user who performed the grant. For example, it would be shocking if one superuser couldn't remove a grant made by another superuser, or for that matter if a superuser couldn't remove a grant made by a non-superuser. But there are a few open questions in my mind: 1. What should be the exact rule for whether A can remove a grant made by B? Is it has_privs_of_role()? is_member_of_role()? 
Something else? 2. What happens if the same GRANT is enacted by multiple users? For example, suppose peon does "GRANT peon to boss" and then the superuser does the same thing afterwards, or vice versa? One design would be to try to track those as two separate grants, but I'm not sure if we want to add that much complexity, since that's not how we do it now and it would, for example, implicate the choice of PK on the pg_auth_members table. An idea that occurs to me is to say that the first GRANT works and becomes the grantor of record, and any duplicate GRANT that happens later issues a NOTICE without changing anything. If the user performing the later GRANT has sufficient privileges and wishes to do so, s/he can REVOKE first and then re-GRANT. On the other hand, for other types of grants, like table privileges, we do track multiple grants by different users, so maybe we should do the same thing here: rhaas=# create table example (a int, b int); CREATE TABLE rhaas=# grant select on table example to foo with grant option; GRANT rhaas=# grant select on table example to bar with grant option; GRANT rhaas=# \c - foo You are now connected to database "rhaas" as user "foo". rhaas=> grant select on table example to exemplar; GRANT rhaas=> \c - bar You are now connected to database "rhaas" as user "bar". rhaas=> grant select on table example to exemplar; GRANT rhaas=> select relacl from pg_class where relname = 'example'; relacl --- {rhaas=arwdDxt/rhaas,foo=r*/rhaas,bar=r*/rhaas,exemplar=r/foo,exemplar=r/bar} (1 row) 3. What happens if a user is dropped after being recorded as a grantor? We actually have a grantor column in pg_auth_members today, but it's not properly maintained. If the grantor is dropped the OID remains in the table, and could eventually end up pointing to some other user if the OID counter wraps around and a new role is created with the same OID. That's completely unacceptable for something we want to use for any serious purpose. 
I suggest that what ought to happen is the role should acquire a dependency on the grant, such that DROP fails and the GRANT is listed as something to be dropped, and DROP OWNED BY drops the GRANT. I think this would require adding an OID column to pg_auth_members so that a dependency can point to it, which sounds like a significant infrastructure change that would need to be carefully validated for adverse side effects, but not a huge crazy problem that we can't get past. 4. Should we apply this rule to other types of grants, rather than just to role membership? Why or why not? Consider this: rhaas=# create user accountant; CREATE ROLE rhaas=# create user auditor; CREATE ROLE rhaas=# create table money (a int, b text); CREATE TABLE rhaas=# alter table money owner to accountant; ALTER TABLE rhaas=# grant select on table money to auditor; GRANT rhaas=# \c - accountant You are now connected to database
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
On Sat, Mar 5, 2022 at 7:10 AM Andres Freund wrote: > Or perhaps pg_ctl ought to pass CREATE_NEW_PROCESS_GROUP to CreateProcess()? > The lack of a process group would explain why we're getting signalled on > ctrl-c... I thought that sounded promising, given that the recent Cirrus agent commit you pointed to says "Always kill spawned shell's process group to avoid pipe FD hangs", and given that we do something conceptually similar on Unix. It doesn't seem to help, though... https://cirrus-ci.com/task/5572163880091648
Re: ltree_gist indexes broken after pg_upgrade from 12 to 13
Tomas Vondra writes: > On 3/4/22 20:29, Nikita Glukhov wrote: >> So, we probably have corrupted indexes that were updated since such >> "incomplete" upgrade of ltree. > IIRC pg_upgrade is not expected to upgrade extensions - it keeps the > installed version of the extension, and that's intentional. Yeah, exactly. But this opens up an additional consideration we have to account for: whatever we do needs to work with either 1.1 or 1.2 SQL-level versions of the extension. regards, tom lane
Re: SQL/JSON: JSON_TABLE
On 3/4/22 13:13, Erikjan Rijkers wrote: > Op 04-03-2022 om 17:33 schreef Andrew Dunstan: >> > >> This set of patches deals with items 1..7 above, but not yet the ERROR >> ON ERROR issue. It also makes some message cleanups, but there is more >> to come in that area. >> >> It is based on the latest SQL/JSON Functions patch set, which does not >> include the sql_json GUC patch. >> > > [0001-SQL-JSON-functions-without-sql_json-GUC-v56.patch] > > [0002-JSON_TABLE-v56.patch] > > [0003-JSON_TABLE-PLAN-DEFAULT-clause-v56.patch] > > [0004-JSON_TABLE-PLAN-clause-v56.patch] > > Hi, > > I quickly tried the tests I already had and there are two statements > that stopped working: > > testdb=# SELECT JSON('{"a": 123, "b": [true, "foo"], "a2": "bar"}' > RETURNING jsonb); > ERROR: syntax error at or near "RETURNING" > LINE 1: ...SON('{"a": 123, "b": [true, "foo"], "a2": "bar"}' RETURNING > ... > > testdb=# select JSON_SCALAR(123.45 returning jsonb); > ERROR: syntax error at or near "returning" > LINE 1: select JSON_SCALAR(123.45 returning jsonb) > > (the '^' pointer in both cases underneath 'RETURNING' > > > Yes, you're right, that was implemented as part of the GUC patch. I'll try to split that out and send new patchsets with the RETURNING clause but without the GUC (see upthread for reasons) cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: ltree_gist indexes broken after pg_upgrade from 12 to 13
On 3/4/22 20:29, Nikita Glukhov wrote: > On 25.02.2022 00:15, Tomas Vondra wrote: > >> Hi, >> >> Victor Yegorov reported a crash related to a GiST index on ltree [1], >> following a pg_upgrade from 12.x to 14.x, with a data set reproducing >> this. I spent some time investigating this, and it seems this is a silly >> bug in commit >> >> commit 911e70207703799605f5a0e8aad9f06cff067c63 (HEAD) >> Author: Alexander Korotkov >> Date: Mon Mar 30 19:17:11 2020 +0300 >> >> Implement operator class parameters >> ... >> >> in PG13, which modified ltree_gist so that siglen is opclass parameter >> (and not hard-coded). But the procedures use LTREE_GET_ASIGLEN macro to >> extract the value, which either extracts the value from the catalog (if >> the index has opclass parameter) or uses a default value - but it always >> uses LTREE_ASIGLEN_DEFAULT, which belongs to _ltree_gist opclass (i.e. >> for array of ltree). And that's 28 instead of the 8, as it should be. > > It seems that ltree extension simply was not updated to v1.2 after > pg_upgrade: > ALTER EXTENSION ltree UPDATE TO '1.2'; -- is missing > > Upgrade script ltree--1.1--1.2.sql creates ltree_gist_options() and > registers it in the opclass. ltree_gist_options() initializes bytea > options using the correct SIGLEN_DEFAULT=8. > > If ltree_gist_options() is absent, LTREE_GET_ASIGLEN() receives NULL > and wrong LTREE_ASIGLEN_DEFAULT is used. But if ltree_gist_options() > is registered, LTREE_GET_ASIGLEN() receives non-NULL bytea options > with the correct default value. > > > So, we probably have corrupted indexes that were updated since such > "incomplete" upgrade of ltree. > IIRC pg_upgrade is not expected to upgrade extensions - it keeps the installed version of the extension, and that's intentional. Moreover, it's perfectly legal to install older version, so you can do CREATE EXTENSION ltree VERSION '1.1'; So I don't think we can call this "incomplete upgrade". 
> > Also I found that contrib/pg_trgm/trgm_gist.c uses wrongly named macro > LTREE_GET_ASIGLEN(). Does it? When I grep for LTREE_GET_ASIGLEN, I don't get any matches in pg_trgm. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
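For readers following the thread: the siglen being discussed is the opclass parameter added in ltree 1.2, settable per index. A sketch of the two situations (the siglen value here is illustrative):

```sql
-- With ltree 1.2, the signature length is an explicit opclass parameter:
CREATE INDEX path_gist_idx ON test USING gist (path gist_ltree_ops (siglen = 8));

-- With ltree 1.1 (e.g. as kept in place by pg_upgrade), there is no options
-- procedure, so the GiST support functions fall back to a compiled-in
-- default -- which is where the wrong LTREE_ASIGLEN_DEFAULT (28 instead
-- of 8) comes into play:
CREATE INDEX path_gist_idx2 ON test USING gist (path);
```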
Re: Allow async standbys wait for sync replication
On Wed, Mar 02, 2022 at 09:47:09AM +0530, Bharath Rupireddy wrote: > On Wed, Mar 2, 2022 at 2:57 AM Nathan Bossart > wrote: >> I think there are a couple of advantages. For one, spinning is probably >> not the best from a resource perspective. > > Just to be on the same page - by spinning do you mean - the async > walsender waiting for the sync flushLSN in a for-loop with > WaitLatch()? Yes. >> Also, this approach might fit in better >> with the existing synchronous replication framework. When a WAL sender >> realizes that it can't send up to the current "flush" LSN because it's not >> synchronously replicated, it will request to be alerted when it is. > > I think you are referring to the way a backend calls SyncRepWaitForLSN > and waits until any one of the walsender sets syncRepState to > SYNC_REP_WAIT_COMPLETE in SyncRepWakeQueue. Firstly, SyncRepWaitForLSN > blocking i.e. the backend spins/waits in for (;;) loop until its > syncRepState becomes SYNC_REP_WAIT_COMPLETE. The backend doesn't do > any other work but waits. So, spinning isn't avoided completely. > > Unless, I'm missing something, the existing syc repl queue > (SyncRepQueue) mechanism doesn't avoid spinning in the requestors > (backends) SyncRepWaitForLSN or in the walsenders SyncRepWakeQueue. My point is that there are existing tools for alerting processes when an LSN is synchronously replicated and for waking up WAL senders. What I am proposing wouldn't involve spinning in XLogSendPhysical() waiting for synchronous replication. Like SyncRepWaitForLSN(), we'd register our LSN in the queue (SyncRepQueueInsert()), but we wouldn't sit in a separate loop waiting to be woken. Instead, SyncRepWakeQueue() would eventually wake up the WAL sender and trigger another iteration of WalSndLoop(). -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
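To make the control flow being proposed concrete, here is a much-simplified model in plain Python (purely illustrative, not the actual SyncRep code): waiters register an LSN in a queue, and the syncrep machinery later wakes exactly those waiters whose LSN has been synchronously flushed, instead of each waiter spinning on its own.

```python
import heapq

class SyncRepQueue:
    """Toy model of the register-then-wake pattern: park waiters on an
    LSN, wake them when that LSN is synchronously replicated."""

    def __init__(self):
        self._waiters = []   # min-heap of (lsn, seq, wake_callback)
        self._seq = 0        # tie-breaker so equal LSNs never compare callbacks

    def insert(self, lsn, wake_callback):
        # Analogue of SyncRepQueueInsert(): park a waiter on its LSN
        # instead of letting it spin in its own wait loop.
        heapq.heappush(self._waiters, (lsn, self._seq, wake_callback))
        self._seq += 1

    def wake_up_to(self, flush_lsn):
        # Analogue of SyncRepWakeQueue(): wake every waiter whose LSN is
        # now synchronously replicated; the rest stay parked.
        woken = 0
        while self._waiters and self._waiters[0][0] <= flush_lsn:
            _, _, wake = heapq.heappop(self._waiters)
            wake()           # e.g. set a WAL sender's latch; WalSndLoop iterates
            woken += 1
        return woken

q = SyncRepQueue()
notified = []
q.insert(100, lambda: notified.append(100))
q.insert(250, lambda: notified.append(250))
assert q.wake_up_to(200) == 1   # only LSN 100 is flushed so far
assert notified == [100]
```

The point of the model is that no waiter polls: progress is driven entirely by the wake call, which matches the "trigger another iteration of WalSndLoop()" behavior described above.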
Re: ltree_gist indexes broken after pg_upgrade from 12 to 13
On 25.02.2022 00:15, Tomas Vondra wrote:
> Hi,
>
> Victor Yegorov reported a crash related to a GiST index on ltree [1],
> following a pg_upgrade from 12.x to 14.x, with a data set reproducing
> this. I spent some time investigating this, and it seems this is a silly
> bug in commit
>
> commit 911e70207703799605f5a0e8aad9f06cff067c63 (HEAD)
> Author: Alexander Korotkov
> Date: Mon Mar 30 19:17:11 2020 +0300
>
> Implement operator class parameters
> ...
>
> in PG13, which modified ltree_gist so that siglen is opclass parameter
> (and not hard-coded). But the procedures use LTREE_GET_ASIGLEN macro to
> extract the value, which either extracts the value from the catalog (if
> the index has opclass parameter) or uses a default value - but it always
> uses LTREE_ASIGLEN_DEFAULT, which belongs to _ltree_gist opclass (i.e.
> for array of ltree). And that's 28 instead of the 8, as it should be.

It seems that ltree extension simply was not updated to v1.2 after
pg_upgrade:

    ALTER EXTENSION ltree UPDATE TO '1.2';  -- is missing

Upgrade script ltree--1.1--1.2.sql creates ltree_gist_options() and
registers it in the opclass. ltree_gist_options() initializes bytea
options using the correct SIGLEN_DEFAULT=8.

If ltree_gist_options() is absent, LTREE_GET_ASIGLEN() receives NULL
and wrong LTREE_ASIGLEN_DEFAULT is used. But if ltree_gist_options()
is registered, LTREE_GET_ASIGLEN() receives non-NULL bytea options
with the correct default value.

So, we probably have corrupted indexes that were updated since such
"incomplete" upgrade of ltree.

Also I found that contrib/pg_trgm/trgm_gist.c uses wrongly named macro
LTREE_GET_ASIGLEN().

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Proposal: Support custom authentication methods using hooks
On Thu, 2022-03-03 at 11:12 +0100, Peter Eisentraut wrote: > At the moment, it is not possible to judge whether the hook interface > you have chosen is appropriate. > > I suggest you actually implement the Azure provider, then make the hook > interface, and then show us both and we can see what to do with it. To add a data point here, I've rebased my OAUTHBEARER experiment [1] on top of this patchset. (That should work with Azure's OIDC provider, and if it doesn't, I'd like to know why.) After the port, here are the changes I still needed to carry in the backend to get the tests passing: - I needed to add custom HBA options to configure the provider. - I needed to declare usermap support so that my provider could actually use check_usermap(). - I had to modify the SASL mechanism registration to allow a custom maximum message length, but I think that's not the job of Samay's proposal to fix; it's just a needed improvement to CheckSASLAuth(). Obviously, the libpq frontend still needs to understand how to speak the new SASL mechanism. There are third-party SASL implementations that are plugin-based, which could potentially ease the pain here, at the expense of a major dependency and a very new distribution model. --Jacob [1] https://www.postgresql.org/message-id/flat/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel%40vmware.com
Re: Proposal: Support custom authentication methods using hooks,Re: Proposal: Support custom authentication methods using hooks
On Thu, Mar 3, 2022 at 11:50 PM Tatsuo Ishii wrote: > > >> So, dropping plaintext password authentication support from libpq will > >> make it impossible for users to use the former method. > > > > Yes, just like dropping support for md5 would make it impossible for > > users to have their passwords be hashed with md5, which is an altogether > > good thing considering how easy it is to brute-force md5 these days. > > I still don't understand why using plaintext password authentication > over SSL connection is considered insecure. Actually we have been > stating opposite in the manual: > https://www.postgresql.org/docs/14/auth-password.html > > "If the connection is protected by SSL encryption then password can be > used safely, though." If you aren't doing client verification (i.e., cert in pg_hba) and are not doing verify-full on the client side then a man-in-the-middle attack on TLS is trivial, and the plaintext password will be sniffable. The documentation should be updated to say under no circumstances is this safe.
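To make the point concrete: password authentication over TLS is only trustworthy when the client actually verifies the server's identity. A minimal libpq connection-string sketch (host, user, and paths are placeholders):

```
# sslmode=verify-full makes libpq verify the server certificate chain AND
# that the hostname matches the certificate, which defeats the trivial
# man-in-the-middle attack described above.
host=db.example.com dbname=app user=alice sslmode=verify-full sslrootcert=/path/to/root.crt
```

With weaker settings (sslmode=require or below), the channel is encrypted but the peer is unauthenticated, so an interposed attacker can terminate TLS itself and capture the cleartext password.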
Re: SQL/JSON: JSON_TABLE
Op 04-03-2022 om 17:33 schreef Andrew Dunstan: This set of patches deals with items 1..7 above, but not yet the ERROR ON ERROR issue. It also makes some message cleanups, but there is more to come in that area. It is based on the latest SQL/JSON Functions patch set, which does not include the sql_json GUC patch. > [0001-SQL-JSON-functions-without-sql_json-GUC-v56.patch] > [0002-JSON_TABLE-v56.patch] > [0003-JSON_TABLE-PLAN-DEFAULT-clause-v56.patch] > [0004-JSON_TABLE-PLAN-clause-v56.patch] Hi, I quickly tried the tests I already had and there are two statements that stopped working: testdb=# SELECT JSON('{"a": 123, "b": [true, "foo"], "a2": "bar"}' RETURNING jsonb); ERROR: syntax error at or near "RETURNING" LINE 1: ...SON('{"a": 123, "b": [true, "foo"], "a2": "bar"}' RETURNING ... testdb=# select JSON_SCALAR(123.45 returning jsonb); ERROR: syntax error at or near "returning" LINE 1: select JSON_SCALAR(123.45 returning jsonb) (the '^' pointer in both cases underneath 'RETURNING' thanks, Erik Rijkers cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Hi, On 2022-03-04 09:57:35 -0800, Andres Freund wrote: > On 2022-03-04 09:46:44 -0800, Andres Freund wrote: > > On 2022-03-04 09:30:37 -0800, Andres Freund wrote: > > > I wonder if we're missing some steps, at least on windows, to make pg_ctl > > > start independent of the starting shell? > > > > Sure looks that way. On windows, if I do pg_ctl start, then hit ctrl-c, the > > server shuts down. > > Short term the easiest fix might be to start postgres for those tests as a > service. But it seems we should fix whatever the cause of that > terminal-connectedness behaviour is. > > I'm out for ~2-3h. I started a test run with using a service just now: > https://cirrus-ci.com/task/5519573792325632 but I very well might have typoed > something... Seems to have worked for the first few tests at least. Unless somebody wants to clean up that commit and push it, I'll do so once I'm back. Perhaps pg_ctl needs to call FreeConsole() or such? https://docs.microsoft.com/en-us/windows/console/closing-a-console Or perhaps pg_ctl ought to pass CREATE_NEW_PROCESS_GROUP to CreateProcess()? The lack of a process group would explain why we're getting signalled on ctrl-c... Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
On 2022-03-04 09:46:44 -0800, Andres Freund wrote: > On 2022-03-04 09:30:37 -0800, Andres Freund wrote: > > I wonder if we're missing some steps, at least on windows, to make pg_ctl > > start independent of the starting shell? > > Sure looks that way. On windows, if I do pg_ctl start, then hit ctrl-c, the > server shuts down. Short term the easiest fix might be to start postgres for those tests as a service. But it seems we should fix whatever the cause of that terminal-connectedness behaviour is. I'm out for ~2-3h. I started a test run with using a service just now: https://cirrus-ci.com/task/5519573792325632 but I very well might have typoed something...
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Hi, On 2022-03-04 09:30:37 -0800, Andres Freund wrote: > I wonder if we're missing some steps, at least on windows, to make pg_ctl > start independent of the starting shell? Sure looks that way. On windows, if I do pg_ctl start, then hit ctrl-c, the server shuts down. Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Andres Freund writes: > On 2022-03-04 12:23:43 -0500, Tom Lane wrote: >> pg_regress thinks the postmaster is listening to a Unix socket. > pg_regress outputs the above message when neither PGHOST, PGPORT or > --host/--port are set. On windows that nevertheless ends up with connecting to > localhost. Yeah, I just traced that down. pg_regress was not updated when libpq's behavior was changed for platform-varying DEFAULT_PGSOCKET_DIR. I'll go fix that, but you're right that it's cosmetic. regards, tom lane
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Hi, On 2022-03-04 12:23:43 -0500, Tom Lane wrote: > Actually ... looking closer at the failure log: > > [09:41:50.589] Checking plpgsql > [09:41:50.608] (using postmaster on Unix socket, default port) > ^^^ > [09:41:50.608] == dropping database "pl_regression" > == > [09:41:52.731] psql: error: connection to server at "localhost" (::1), port > 5432 failed: Connection refused (0x274D/10061) > [09:41:52.731]Is the server running on that host and accepting TCP/IP > connections? > [09:41:52.731] connection to server at "localhost" (127.0.0.1), port 5432 > failed: Connection refused (0x274D/10061) > [09:41:52.731]Is the server running on that host and accepting TCP/IP > connections? > > pg_regress thinks the postmaster is listening to a Unix socket. > Maybe it's *only* listening to a Unix socket. In any case, > psql has very clearly not been told to use a Unix socket. > Something is not wired up correctly there. I saw that too, but I don't think it's the primary problem. The server is listening on ::1 as we know from the log, and quite evidently psql is trying to connect to that too. This output also looked this way before the failures. pg_regress outputs the above message when neither PGHOST, PGPORT or --host/--port are set. On windows that nevertheless ends up with connecting to localhost. Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Hi, On 2022-03-04 09:10:14 -0800, Andres Freund wrote: > > I find it a little suspicious that startcreate is merrily starting > > the postmaster on port 5432, without concern for the possibility > > that that port is in use or blocked by a packet filter. > > It's a container that's just created for that run, so there shouldn't be > anything else running. We also see that the server successfully starts: > > 2022-03-03 23:24:15.261 GMT [5504][postmaster] LOG: starting PostgreSQL > 15devel, compiled by Visual C++ build 1929, 64-bit > 2022-03-03 23:24:15.270 GMT [5504][postmaster] LOG: listening on IPv6 > address "::1", port 5432 > 2022-03-03 23:24:15.270 GMT [5504][postmaster] LOG: listening on IPv4 > address "127.0.0.1", port 5432 > 2022-03-03 23:24:15.321 GMT [5808][startup] LOG: database system was shut > down at 2022-03-03 23:24:15 GMT > 2022-03-03 23:24:15.341 GMT [5504][postmaster] LOG: database system is ready > to accept connections Oh, huh. I added a printout of the running tasks, and it sure looks like postgres is not running anymore when plcheck is run, without anything ending up in the server log... It looks like the CI agent had some changes about stdin / stdout of the shells used to start scripts. The version between the last successful run and the first failing run changed. https://github.com/cirruslabs/cirrus-ci-agent/commits/master I wonder if we're missing some steps, at least on windows, to make pg_ctl start independent of the starting shell? Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Andres Freund writes: > On 2022-03-04 09:54:28 -0500, Tom Lane wrote: >> I find it a little suspicious that startcreate is merrily starting >> the postmaster on port 5432, without concern for the possibility >> that that port is in use or blocked by a packet filter. > It's a container that's just created for that run, so there shouldn't be > anything else running. We also see that the server successfully starts: True, so the port was free. The postmaster wouldn't know whether there's a relevant packet filter though. The reason I'm harping on this is that it's hard to see why this postmaster start would fail before anything was actually done, when several previous starts on other ports behaved just fine. Actually ... looking closer at the failure log: [09:41:50.589] Checking plpgsql [09:41:50.608] (using postmaster on Unix socket, default port) ^^^ [09:41:50.608] == dropping database "pl_regression" == [09:41:52.731] psql: error: connection to server at "localhost" (::1), port 5432 failed: Connection refused (0x274D/10061) [09:41:52.731] Is the server running on that host and accepting TCP/IP connections? [09:41:52.731] connection to server at "localhost" (127.0.0.1), port 5432 failed: Connection refused (0x274D/10061) [09:41:52.731] Is the server running on that host and accepting TCP/IP connections? pg_regress thinks the postmaster is listening to a Unix socket. Maybe it's *only* listening to a Unix socket. In any case, psql has very clearly not been told to use a Unix socket. Something is not wired up correctly there. regards, tom lane
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Hi, On 2022-03-04 09:54:28 -0500, Tom Lane wrote: > Thomas Munro writes: > > https://github.com/postgres/postgres/commits/master > > > I don't see what that patch has to do with the symptoms. > > The buildfarm is not unhappy, so I'd be looking at what changed > recently in the cfbot's setup. Very odd. > I find it a little suspicious that startcreate is merrily starting > the postmaster on port 5432, without concern for the possibility > that that port is in use or blocked by a packet filter. It's a container that's just created for that run, so there shouldn't be anything else running. We also see that the server successfully starts: 2022-03-03 23:24:15.261 GMT [5504][postmaster] LOG: starting PostgreSQL 15devel, compiled by Visual C++ build 1929, 64-bit 2022-03-03 23:24:15.270 GMT [5504][postmaster] LOG: listening on IPv6 address "::1", port 5432 2022-03-03 23:24:15.270 GMT [5504][postmaster] LOG: listening on IPv4 address "127.0.0.1", port 5432 2022-03-03 23:24:15.321 GMT [5808][startup] LOG: database system was shut down at 2022-03-03 23:24:15 GMT 2022-03-03 23:24:15.341 GMT [5504][postmaster] LOG: database system is ready to accept connections Greetings, Andres Freund
Re: Make unlogged table resets detectable
Is this patch targeting pg15? There's no discussion since June. Latest at 2021-06-08 21:29:25 by Jeff Davis

2022-02-02 16:37:58  Julien Rouhaud (rjuju)             Closed in commitfest 2022-01 with status: Moved to next CF
2021-12-03 06:18:05  Michael Paquier (michael-kun)      Closed in commitfest 2021-11 with status: Moved to next CF
2021-10-04 16:32:49  Jaime Casanova (jcasanov)          Closed in commitfest 2021-09 with status: Moved to next CF
2021-08-03 02:29:40  Masahiko Sawada (masahikosawada)   Closed in commitfest 2021-07 with status: Moved to next CF
Re: Add LZ4 compression in pg_dump
The patch is failing on cfbot/freebsd. http://cfbot.cputube.org/georgios-kokolatos.html Also, I wondered if you'd looked at the "high compression" interfaces in lz4hc.h ? Should pg_dump use that ? On Fri, Feb 25, 2022 at 08:03:40AM -0600, Justin Pryzby wrote: > Thanks for working on this. Your 0001 looks similar to what I did for zstd > 1-2 > years ago. > https://commitfest.postgresql.org/32/2888/ > > I rebased and attached the latest patches I had in case they're useful to you. > I'd like to see zstd included in pg_dump eventually, but it was too much work > to shepherd the patches. Now that seems reasonable for pg16. > > With the other compression patches I've worked on, we've used an extra patch > with changes the default to the new compression algorithm, to force cfbot to > exercize the new code. > > Do you know the process with commitfests and cfbot ? > There's also this, which allows running the tests on cirrus before mailing the > patch to the hackers list. > ./src/tools/ci/README
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Thomas Munro writes: > On Fri, Mar 4, 2022 at 10:37 PM Bharath Rupireddy > wrote: >> It looks like regression tests are failing on Windows Server 2019 [1]. >> Ignore if it's reported elsewhere. > It's broken since 46ab07ff "Clean up assorted failures under clang's > -fsanitize=undefined checks.": > https://github.com/postgres/postgres/commits/master > I don't see what that patch has to do with the symptoms. The buildfarm is not unhappy, so I'd be looking at what changed recently in the cfbot's setup. I find it a little suspicious that startcreate is merrily starting the postmaster on port 5432, without concern for the possibility that that port is in use or blocked by a packet filter. regards, tom lane
Re: New Table Access Methods for Multi and Single Inserts
On Mon, 19 Apr 2021 at 06:52, Bharath Rupireddy wrote: > > On Mon, Apr 5, 2021 at 9:49 AM Bharath Rupireddy > wrote: > > > > On Wed, Mar 10, 2021 at 10:21 AM Bharath Rupireddy > > wrote: > > > Attaching the v4 patch set. Please review it further. > > > > Attaching v5 patch set after rebasing onto the latest master. > > Another rebase due to conflicts in 0003. Attaching v6 for review. I recently touched the topic of multi_insert, and I remembered this patch. I had to dig a bit to find it, but as it's still open I've added some comments: > diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h > +#define MAX_BUFFERED_TUPLES 1000 > +#define MAX_BUFFERED_BYTES 65535 It looks like these values were copied over from copyfrom.c, but are now expressed in the context of the heapam. As these values are now heap-specific (as opposed to the TableAM-independent COPY infrastructure), shouldn't we instead optimize for heap page insertions? That is, I suggest using a multiple (1 or more) of MaxHeapTuplesPerPage for _TUPLES, and that same / a similar multiple of BLCKSZ for MAX_BUFFERED_BYTES. > TableInsertState->flushed > TableInsertState->mi_slots I don't quite like the current storage-and-feedback mechanism for flushed tuples. The current assumptions in this mechanism seem to be that 1.) access methods always want to flush all available tuples at once, 2.) access methods want to maintain the TupleTableSlots for all inserted tuples that have not yet had all triggers handled, and 3.) we need access to the not-yet-flushed TupleTableSlots. I think that that is not a correct set of assumptions; I think that only flushed tuples need to be visible to the tableam-agnostic code; and that tableams should be allowed to choose which tuples to flush at which point; as long as they're all flushed after a final multi_insert_flush. 
Examples: A heap-based access method might want to bin-pack tuples and write out full pages only; and thus keep some tuples in the buffers as they didn't fill a page; thus having flushed only a subset of the current buffered tuples. A columnstore-based access method might not want to maintain the TupleTableSlots of original tuples, but instead use temporary columnar storage: TupleTableSlots are quite large when working with huge amounts of tuples; and keeping lots of tuple data in memory is expensive. As such, I think that this should be replaced with a TableInsertState->mi_flushed_slots + TableInsertState->mi_flushed_len, managed by the tableAM, in which only the flushed tuples are visible to the AM-agnostic code. This is not much different from how the implementation currently works; except that ->mi_slots now does not expose unflushed tuples; and that ->flushed is replaced by an integer value of the number of flushed tuples. A further improvement (in my opinion) would be the change from a single multi_insert_flush() to a signalling-based multi_insert_flush: It is not unreasonable for e.g. a columnstore to buffer tens of thousands of inserts; but doing so in TupleTableSlots would introduce high memory usage. Allowing for batched retrieval of flushed tuples would help with memory usage; which is why multiple calls to multi_insert_flush() could be useful. To handle this gracefully, we'd probably add TIS->mi_flush_remaining, where each insert adds one to mi_flush_remaining; and each time mi_flushed_slots has been handled, mi_flush_remaining is decreased by mi_flushed_len by the handler code. Once we're done inserting into the table, we keep calling multi_insert_flush until no more tuples are being flushed (and error out if we're still waiting for flushes but no new flushed tuples are returned). - Matthias
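The signalling-based flush protocol sketched above could be modeled roughly as follows. The field names (mi_flushed_len, mi_flush_remaining) come from the mail; the types, batch size, and simulation around them are invented purely for illustration:

```c
#include <assert.h>
#include <stdio.h>

#define NTUPLES 10				/* tuples we will "insert" */
#define BATCH	4				/* the AM's chosen flush batch size */

/* Heavily simplified stand-in for the proposed TableInsertState fields. */
typedef struct TableInsertState
{
	int			buffered;			/* inserted but not yet flushed */
	int			mi_flushed_len;		/* tuples exposed by the last flush */
	int			mi_flush_remaining; /* inserts not yet handed back */
} TableInsertState;

static void
multi_insert(TableInsertState *s)
{
	s->buffered++;
	s->mi_flush_remaining++;	/* each insert adds one, per the proposal */
}

/* The AM decides how much to flush per call; here at most BATCH tuples. */
static void
multi_insert_flush(TableInsertState *s)
{
	s->mi_flushed_len = (s->buffered < BATCH) ? s->buffered : BATCH;
	s->buffered -= s->mi_flushed_len;
}

int
main(void)
{
	TableInsertState s = {0, 0, 0};
	int			handled = 0;

	for (int i = 0; i < NTUPLES; i++)
		multi_insert(&s);

	/* Keep calling flush until every buffered tuple has been returned;
	 * the handler (e.g. trigger code) sees only flushed tuples. */
	while (s.mi_flush_remaining > 0)
	{
		multi_insert_flush(&s);
		handled += s.mi_flushed_len;
		s.mi_flush_remaining -= s.mi_flushed_len;
	}

	assert(handled == NTUPLES);
	printf("handled %d tuples\n", handled);
	return 0;
}
```

The point of the loop is exactly Matthias' memory argument: the caller never needs more than one batch of flushed tuples in hand at a time, however many thousands the AM buffers internally.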
walmethods.c is kind of a mess (was Re: refactoring basebackup.c)
On Fri, Mar 4, 2022 at 3:32 AM Dipesh Pandit wrote: > GZIP manages to overcome this problem as it provides an option to turn on/off > compression on the fly while writing a compressed archive with the help of > zlib > library function deflateParams(). The current gzip implementation for > CreateWalTarMethod uses this library function to turn off compression just > before > step #1 and it writes the uncompressed header of size equal to TAR_BLOCK_SIZE. > It uses the same library function to turn on the compression for writing the > contents > of the WAL file as part of step #2. It again turns off the compression just > before step > #3 to overwrite the header. The header is overwritten at the same offset with > size > equal to TAR_BLOCK_SIZE. This is a real mess. To me, it seems like a pretty big hack to use deflateParams() to shut off compression in the middle of the compressed data stream so that we can go back and overwrite that part of the data later. It appears that the only reason we need that hack is because we don't know the file size starting out. Except we kind of do know the size, because pad_to_size specifies a minimum size for the file. It's true that the maximum file size is unbounded, but I'm not sure why that's important. I wonder if anyone else has an idea why we didn't just set the file size to pad_to_size exactly when we write the tar header the first time, instead of this IMHO kind of nutty approach where we back up. I'd try to figure it out from the comments, but there basically aren't any. I also had a look at the relevant commit messages and didn't see anything relevant there either. If I'm missing something, please point it out. While I'm complaining, I noticed while looking at this code that it is documented that "The caller must ensure that only one method is instantiated in any given program, and that it's only instantiated once!" 
As far as I can see, this is because somebody thought about putting all of the relevant data into a struct and then decided on an alternative strategy of storing some of it there, and the rest in a global variable. I can't quite imagine why anyone would think that was a good idea. There may be some reason that I can't see right now, but here again there appear to be no relevant code comments. I'm somewhat inclined to wonder whether we could just get rid of walmethods.c entirely and use the new bbstreamer stuff instead. That code also knows how to write plain files into a directory, and write tar archives, and compress stuff, but in my totally biased opinion as the author of most of that code, it's better code. It has no restriction on using at most one method per program, or of instantiating that method only once, and it already has LZ4 support, and there's a pending patch for ZSTD support that I intend to get committed soon as well. It also has, and I know I might be beating a dead horse here, comments. Now, admittedly, it does need to know the size of each archive member up front in order to work, so if we can't solve the problem then we can't go this route. But if we can't solve that problem, then we also can't add LZ4 and ZSTD support to walmethods.c, because random access to compressed data is not really a thing, even if we hacked it to work for gzip. Thoughts? -- Robert Haas EDB: http://www.enterprisedb.com
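For what it's worth, the back-up-and-overwrite dance being criticized only works because the header bytes stay uncompressed. Stripped of compression entirely, the three steps look roughly like this (simplified, hypothetical member layout, not the real walmethods.c code):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TAR_BLOCK_SIZE 512

int
main(void)
{
	FILE	   *f = tmpfile();
	char		header[TAR_BLOCK_SIZE] = {0};
	char		check[TAR_BLOCK_SIZE];
	const char *payload = "not really WAL data";
	long		payload_len = (long) strlen(payload);

	assert(f != NULL);

	/* step 1: placeholder header, final size not yet known */
	assert(fwrite(header, 1, TAR_BLOCK_SIZE, f) == TAR_BLOCK_SIZE);

	/* step 2: the member contents */
	assert(fwrite(payload, 1, payload_len, f) == (size_t) payload_len);

	/* step 3: seek back and overwrite the header at the same offset --
	 * this only works because those bytes were written uncompressed;
	 * in a compressed stream you cannot rewrite a fixed byte range */
	snprintf(header, sizeof(header), "size=%ld", payload_len);
	assert(fseek(f, 0L, SEEK_SET) == 0);
	assert(fwrite(header, 1, TAR_BLOCK_SIZE, f) == TAR_BLOCK_SIZE);

	/* verify the rewritten header is what we expect */
	assert(fseek(f, 0L, SEEK_SET) == 0);
	assert(fread(check, 1, TAR_BLOCK_SIZE, f) == TAR_BLOCK_SIZE);
	assert(strcmp(check, "size=19") == 0);
	printf("header rewritten: %s\n", check);
	fclose(f);
	return 0;
}
```

If the member size were known up front (e.g. pad_to_size, or the fixed WAL segment size), step 3 disappears entirely, which is what makes sequential-only compressed output (LZ4, ZSTD, bbstreamer-style) feasible.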
pg_rewind enhancements
Hi, While using pg_rewind, I found it a bit difficult to use because it copies even the configuration files and also removes some of the files created on the old primary which may not be present on the new primary. Similarly, it copies files under the data directory of the new primary which may not be needed or which could possibly be junk files. I would propose to have a couple of new command-line arguments to pg_rewind. One, a comma-separated list of files which should be preserved on the old primary, in other words which shouldn't be overwritten from the new primary. Second, a comma-separated list of files which should be excluded while copying files from the new primary onto the old primary. I would like to invite more thoughts from the hackers. Regards, RKN
Re: Commitfest 2022-03 Patch Triage Part 1b
Just FYI. Better to follow up to the thread for the patch that's already in the CF. Otherwise your patch will be missed by someone who looks at the CF entry to see the latest patch. Indeed. Done. -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Are you planning to send a rebased patch for this commit fest? Argh, I did it in a reply in another thread :-( Attached v15. To help move things forward, I'd suggest that we should not care too much about corner-case repetition of some error messages which are due to libpq internals, so I could remove the ugly buffer reset from the patch and keep the repetition; if/when the issue is fixed later in libpq, then the repetition will be removed, fine! The issue is that we just expose the strange behavior of libpq, which is for libpq to solve, not psql. What do you think? -- Fabien.
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index caabb06c53..f01adb1fd2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. 
- This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. - It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index d65b9a124f..3b2f6305b4 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -34,6 +34,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -354,7 +356,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. 
*/ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -385,7 +387,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -473,6 +475,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -573,7 +587,7 @@ PSQLexec(const char
Re: wal_compression=zstd
On 2/22/22 18:19, Justin Pryzby wrote: > As writen, this patch uses zstd level=1 (whereas the ZSTD's default compress > level is 6). I think this choice needs to be supported by some benchmarks. > > 0001 adds support for zstd > 0002 makes more efficient our use of bits in the WAL header > 0003 makes it the default compression, to exercise on CI (windows will fail). > 0004 adds support for zstd-level > 0005-6 are needed to allow recovery tests to pass when wal compression is > enabled; > 0007 (not included) also adds support for zlib. I'm of the impression nobody > cares about this, otherwise it would've been included 5-10 years ago. > > Only 0001 should be reviewed for pg15 - the others are optional/future work. I don't see why patch 5 shouldn't be applied forthwith. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: wal_compression=zstd
On Fri, Mar 4, 2022 at 6:44 AM Justin Pryzby wrote: > > Why? ZSTD using this default has its reasons, no? And it would be > > consistent to do the same for ZSTD as for the other two methods. > > In my 1-off test, it gets 610/633 = 96% of the benefit at 209/273 = 77% of the > cost. I agree with Michael. Your 1-off test is exactly that, and the results will have depended on the data you used for the test. I'm not saying we could never decide to default to a compression level other than the library's default, but I do not think we should do it casually or as the result of any small number of tests. There should be a strong presumption that the authors of the library have a good idea what is sensible in general unless we can explain compellingly why our use case is different from typical ones. There's an ease-of-use concern here too. It's not going to make things any easier for users to grok if zstd is available in different parts of the system but has different defaults in each place. It wouldn't be the end of the world if that happened, but neither would it be ideal. -- Robert Haas EDB: http://www.enterprisedb.com
Re: [Proposal] vacuumdb --schema only
On 04/03/2022 at 11:56, Justin Pryzby wrote: On Fri, Mar 04, 2022 at 10:11:28AM +0100, Gilles Darold wrote: The attached patch implements that. Option -n | --schema can be used multiple times and cannot be used together with options -a or -t. Yes, thanks. I suggest there should also be an --exclude-schema. Ok, I will add it too. I do not propose to extend the VACUUM and ANALYZE commands because their current syntax doesn't allow me to see an easy way to do that I think this would be easy with the parenthesized syntax. I'm not suggesting to do it there, though. Yes, this is what I've thought: something à la EXPLAIN, for example "VACUUM (ANALYZE, SCHEMA foo)", but this is a change in the VACUUM syntax that needs to keep compatibility with the current syntax. We would have two syntaxes, something like "VACUUM ANALYZE FULL dbname" and "VACUUM (ANALYZE, FULL) dbname". The other syntax "problem" is to be able to use multiple schema values in the VACUUM command, perhaps "VACUUM (ANALYZE, SCHEMA (foo,bar))". + /* +* When filtereing on schema name, filter by table is not allowed. +* The schema name can already be set in a fqdn table name. set *to* Thanks, will be fixed in next patch version. -- Gilles Darold
Re: Add the replication origin name and commit-LSN to logical replication worker errcontext
On Fri, Mar 4, 2022, at 2:54 AM, Amit Kapila wrote: > The LSN of the transaction that contains the change violating the > constraint and the replication origin name can be found from the > server log (LSN 0/14C0378 and replication origin > pg_16395 in the above case). To skip the > transaction, the subscription needs to be disabled temporarily by > ALTER SUBSCRIPTION ... DISABLE first. Then, the > transaction can be skipped by calling the linkend="pg-replication-origin-advance"> > pg_replication_origin_advance() function > with the node_name (i.e., > pg_16395) and the next LSN of the commit LSN (i.e., > LSN 0/14C0379). You could also add: After that the replication can be resumed by ALTER SUBSCRIPTION ... ENABLE. Let's provide complete instructions. -- Euler Taveira EDB https://www.enterprisedb.com/
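Putting the quoted instructions plus Euler's addition together, the complete skip sequence would look as follows. The subscription name is invented for the example; the origin name and LSNs are the ones from the text above:

```sql
-- stop the apply worker that keeps failing on the conflicting transaction
ALTER SUBSCRIPTION mysub DISABLE;

-- skip past the offending transaction: use the next LSN after the commit
-- LSN reported in the server log (here 0/14C0378 -> 0/14C0379), together
-- with the replication origin name from the same log line
SELECT pg_replication_origin_advance('pg_16395', '0/14C0379');

-- resume replication
ALTER SUBSCRIPTION mysub ENABLE;
```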
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Thanks Bharath for working on all my review comments. I took a quick look at the new version of the patch (v7-pg_walinspect.patch) and this version looks a lot better. I'll do some detailed review later (maybe next week or so) and let you know my further comments, if any. -- With Regards, Ashutosh Sharma. On Fri, Mar 4, 2022 at 3:54 PM Bharath Rupireddy wrote: > > On Thu, Mar 3, 2022 at 10:05 PM Robert Haas wrote: > > > > On Fri, Feb 25, 2022 at 6:03 AM Bharath Rupireddy > > wrote: > > > Added a new function that returns the first and last valid WAL record > > > LSN of a given WAL file. > > > > Sounds like fuzzy thinking to me. WAL records can cross file > > boundaries, and forgetting about that leads to all sorts of problems. > > Just give people one function that decodes a range of LSNs and call it > > good. Why do you need anything else? If people want to get the first > > record that begins in a segment or the first record any portion of > > which is in a particular segment or the last record that begins in a > > segment or the last record that ends in a segment or any other such > > thing, they can use a WHERE clause for that... and if you think they > > can't, then that should be good cause to rethink the return value of > > the one-and-only SRF that I think you need here. > > Thanks Robert. > > Thanks to others for your review comments. > > Here's the v7 patch set. These patches are based on the motive that > "keep it simple and short yet effective and useful". With that in > mind, I have not implemented the wait mode for any of the functions > (as it doesn't look an effective use-case and requires adding a new > page_read callback, instead throw error if future LSN is specified), > also these functions will give WAL information on the current server's > timeline. Having said that, I'm open to adding new functions in future > once this initial version gets in, if there's a requirement and users > ask for the new functions. 
> > Please review the v7 patch set and provide your thoughts. > > Regards, > Bharath Rupireddy.
Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Please don't mix comments from multiple reviewers into one thread. It's hard to understand which comments are mine or Julien's or from others. Can you please respond to the email from each of us separately with an inline response. That will be helpful to understand your thoughts on our review comments. -- With Regards, Ashutosh Sharma. On Fri, Mar 4, 2022 at 4:59 PM Nitin Jadhav wrote: > > Thanks for reviewing. > > > 6) How about shutdown and end-of-recovery checkpoint? Are you planning > > to have an ereport_startup_progress mechanism as 0002? > > I thought of including it earlier then I felt lets first make the > current patch stable. Once all the fields are properly decided and the > patch gets in then we can easily extend the functionality to shutdown > and end-of-recovery cases. I have also observed that the timer > functionality wont work properly in case of shutdown as we are doing > an immediate checkpoint. So this needs a lot of discussion and I would > like to handle this on a separate thread. > --- > > > 7) I think you don't need to call checkpoint_progress_start and > > pgstat_progress_update_param, any other progress reporting function > > for shutdown and end-of-recovery checkpoint right? > > I had included the guards earlier and then removed later based on the > discussion upthread. > --- > > > [local]:5432 ashu@postgres=# select * from pg_stat_progress_checkpoint; > > -[ RECORD 1 ]-+- > > pid | 22043 > > type | checkpoint > > kind | immediate force wait requested time > > start_lsn | 0/14C60F8 > > start_time| 2022-03-03 18:59:56.018662+05:30 > > phase | performing two phase checkpoint > > > > > > This is the output I see when the checkpointer process has come out of > > the two phase checkpoint and is currently writing checkpoint xlog > > records and doing other stuff like updating control files etc. Is this > > okay? > > The idea behind choosing the phases is based on the functionality > which takes longer time to execute. 
Since after two phase checkpoint > till post checkpoint cleanup won't take much time to execute, I have > not added any additional phase for that. But I also agree that this > gives wrong information to the user. How about mentioning the phase > information at the end of each phase like "Initializing", > "Initialization done", ..., "two phase checkpoint done", "post > checkpoint cleanup done", .., "finalizing". Except for the first phase > ("initializing") and last phase ("finalizing"), all the other phases > describe the end of a certain operation. I feel this gives correct > information even though the phase name/description does not represent > the entire code block between two phases. For example if the current > phase is ''two phase checkpoint done". Then the user can infer that > the checkpointer has done till two phase checkpoint and it is doing > other stuff that are after that. Thoughts? > > > The output of log_checkpoint shows the number of buffers written is 3 > > whereas the output of pg_stat_progress_checkpoint shows it as 0. See > > below: > > > > 2022-03-03 20:04:45.643 IST [22043] LOG: checkpoint complete: wrote 3 > > buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; > > write=24.652 s, sync=104.256 s, total=3889.625 s; sync files=2, > > longest=0.011 s, average=0.008 s; distance=0 kB, estimate=0 kB > > > > -- > > > > [local]:5432 ashu@postgres=# select * from pg_stat_progress_checkpoint; > > -[ RECORD 1 ]-+- > > pid | 22043 > > type | checkpoint > > kind | immediate force wait requested time > > start_lsn | 0/14C60F8 > > start_time| 2022-03-03 18:59:56.018662+05:30 > > phase | finalizing > > buffers_total | 0 > > buffers_processed | 0 > > buffers_written | 0 > > > > Any idea why this mismatch? > > Good catch. In BufferSync() we have 'num_to_scan' (buffers_total) > which indicates the total number of buffers to be processed. Based on > that, the 'buffers_processed' and 'buffers_written' counter gets > incremented. 
I meant these values may reach upto 'buffers_total'. The > current pg_stat_progress_view support above information. There is > another place when 'ckpt_bufs_written' gets incremented (In > SlruInternalWritePage()). This increment is above the 'buffers_total' > value and it is included in the server log message (checkpoint end) > and not included in the view. I am a bit confused here. If we include > this increment in the view then we cannot calculate the exact > 'buffers_total' beforehand. Can we increment the 'buffers_toal' also > when 'ckpt_bufs_written' gets incremented so that we can match the > behaviour with checkpoint end message? Please share your thoughts. > --- > > > I think we can add a couple of more information to this view - > > start_time for buffer write operation and start_time for buffer sync >
Re: Add the replication origin name and commit-LSN to logical replication worker errcontext
On Fri, Mar 4, 2022 at 2:55 PM Amit Kapila wrote: > > On Fri, Mar 4, 2022 at 6:40 AM Masahiko Sawada wrote: > > > > Attached updated version patches. > > > > The patch looks mostly good to me. Few minor comments: Thank you for the comments! > 1. I think we can have an Assert for errarg->origin_name in > apply_error_callback after checking the command as this function > assumes that it will always be set. Added. > 2. I suggest minor changes in the documentation change: > When a conflict produces an error, the replication won't proceed, and > the apply worker will emit the following kind of message to the > subscriber's server log: > + > +ERROR: duplicate key value violates unique constraint "test_pkey" > +DETAIL: Key (c)=(1) already exists. > +CONTEXT: processing remote data during "INSERT" for replication > target relation "public.test" in transaction 725 committed at LSN > 0/14BFA88 from replication origin "pg_16395" > + > > The LSN of the transaction that contains the change violating the > constraint and the replication origin name can be found from the > server log (LSN 0/14C0378 and replication origin > pg_16395 in the above case). To skip the > transaction, the subscription needs to be disabled temporarily by > ALTER SUBSCRIPTION ... DISABLE first. Then, the > transaction can be skipped by calling the linkend="pg-replication-origin-advance"> > pg_replication_origin_advance() function > with the node_name (i.e., > pg_16395) and the next LSN of the commit LSN (i.e., > LSN 0/14C0379). Fixed. On Fri, Mar 4, 2022 at 3:21 PM Amit Kapila wrote: > > On Fri, Mar 4, 2022 at 11:45 AM osumi.takami...@fujitsu.com > wrote: > > > > On Friday, March 4, 2022 2:23 PM Masahiko Sawada > > wrote: > > > I've attached updated patches. > > Hi, thank you for updating the patch. > > > > One comment on v4. > > > > In v4-0002, we introduce 'commit_lsn' in the ApplyErrorCallbackArg. > > This member is set for prepare, rollback prepared and stream_abort as well. 
> > The new log message format is useful when we have a prepare transaction > > that keeps failing on the subscriber and want to know the target transaction > > for the pg_replication_origin_advance(), right ? > > If this is true, I wasn't sure if the name 'commit_lsn' is the > > most accurate name for this variable. Should we adjust the name a bit ? > > > > If we want to change this variable name, the other options could be > 'end_lsn', or 'finish_lsn'. > Agreed with 'finish_lsn'. I've attached updated patches. Please review them. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ v4-0002-Add-the-origin-name-and-remote-commit-LSN-to-logi.patch Description: Binary data v4-0001-Use-complete-sentences-in-logical-replication-wor.patch Description: Binary data
Re: wal_compression=zstd
On Fri, Mar 04, 2022 at 04:19:32PM +0900, Michael Paquier wrote: > On Tue, Feb 22, 2022 at 05:19:48PM -0600, Justin Pryzby wrote: > > > As writen, this patch uses zstd level=1 (whereas the ZSTD's default compress > > level is 6). > > Why? ZSTD using this default has its reasons, no? And it would be > consistent to do the same for ZSTD as for the other two methods. In my 1-off test, it gets 610/633 = 96% of the benefit at 209/273 = 77% of the cost. postgres=# SET wal_compression= 'zstd-6'; postgres=# \set QUIET \\ \timing on \\ SET max_parallel_maintenance_workers=0; SELECT pg_stat_reset_shared('wal'); begin; CREATE INDEX ON t(a); rollback; SELECT * FROM pg_stat_wal; Duración: 2729,017 ms (00:02,729) wal_bytes| 6102403 postgres=# SET wal_compression= 'zstd-1'; postgres=# \set QUIET \\ \timing on \\ SET max_parallel_maintenance_workers=0; SELECT pg_stat_reset_shared('wal'); begin; CREATE INDEX ON t(a); rollback; SELECT * FROM pg_stat_wal; Duración: 2090,459 ms (00:02,090) wal_bytes| 6330269 It may be relevant that we're only compressing 8k [0]. It would probably pay off to preserve a compression "dictionary" to be re-used across FPIs. > -The supported methods are pglz and > -lz4 (if PostgreSQL was > -compiled with --with-lz4). The default value is > -off. Only superusers can change this setting. > +The supported methods are pglz, and > +(if configured when PostgreSQLwas built) > +lz4 and zstd. > +The default value is off. > +Only superusers can change this setting. > > This is making the docs less verbose. I think that you should keep > the mention to respectively --with-lz4 and --with-zstd after each > value. I changed this relative to your latest zstd patch in July since it reads better to me. YMMV. On Tue, Feb 22, 2022 at 05:19:48PM -0600, Justin Pryzby wrote: >> 0001 adds support for zstd >> 0002 makes more efficient our use of bits in the WAL header >> 0003 makes it the default compression, to exercise on CI (windows will fail). 
>> 0004 adds support for zstd-level >> 0005-6 are needed to allow recovery tests to pass when wal compression is >> enabled; >> 0007 (not included) also adds support for zlib. I'm of the impression nobody >> cares about this, otherwise it would've been included 5-10 years ago. > 0003, defaulting to zstd, would require a lot of benchmarking to be > justified. Maybe you didn't see that I wrote that it was for CI? (In any case, it's impossible to do that without first taking care of 005-6). > and 0004 to extend compression to support a level > I have argued against making the code more complicated for such things in the > past, with reloptions. I suppose that you're referring to reloptions for toast compression, which was about controlling LZ4 compression level. https://www.postgresql.org/message-id/flat/cafitn-vmhu_yakmgno90suih_6ltvmqz5hpqb2itgqtvyoj...@mail.gmail.com I think you're right that it's not interesting to control the compression level of LZ4 - but there's no reason to say the same thing of zstd. We're already debating which is the "right" level to use (1 or 6). I don't think there is a "right" level - some people will want to trade more CPU for disk writes and some people won't.
Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Thanks for reviewing. > 6) How about shutdown and end-of-recovery checkpoint? Are you planning > to have an ereport_startup_progress mechanism as 0002? I thought of including it earlier, then I felt let's first make the current patch stable. Once all the fields are properly decided and the patch gets in, then we can easily extend the functionality to the shutdown and end-of-recovery cases. I have also observed that the timer functionality won't work properly in case of shutdown as we are doing an immediate checkpoint. So this needs a lot of discussion and I would like to handle this on a separate thread. --- > 7) I think you don't need to call checkpoint_progress_start and > pgstat_progress_update_param, any other progress reporting function > for shutdown and end-of-recovery checkpoint right? I had included the guards earlier and then removed them later based on the discussion upthread. --- > [local]:5432 ashu@postgres=# select * from pg_stat_progress_checkpoint; > -[ RECORD 1 ]-+- > pid | 22043 > type | checkpoint > kind | immediate force wait requested time > start_lsn | 0/14C60F8 > start_time| 2022-03-03 18:59:56.018662+05:30 > phase | performing two phase checkpoint > > > This is the output I see when the checkpointer process has come out of > the two phase checkpoint and is currently writing checkpoint xlog > records and doing other stuff like updating control files etc. Is this > okay? The idea behind choosing the phases is based on the functionality which takes longer to execute. Since the stretch from the two-phase checkpoint to the post-checkpoint cleanup won't take much time to execute, I have not added any additional phase for it. But I also agree that this gives wrong information to the user. How about mentioning the phase information at the end of each phase, like "Initializing", "Initialization done", ..., "two phase checkpoint done", "post checkpoint cleanup done", ..., "finalizing". 
Except for the first phase ("initializing") and last phase ("finalizing"), all the other phases describe the end of a certain operation. I feel this gives correct information even though the phase name/description does not represent the entire code block between two phases. For example, if the current phase is 'two phase checkpoint done', then the user can infer that the checkpointer has finished the two-phase checkpoint and is doing the other work that comes after it. Thoughts? > The output of log_checkpoint shows the number of buffers written is 3 > whereas the output of pg_stat_progress_checkpoint shows it as 0. See > below: > > 2022-03-03 20:04:45.643 IST [22043] LOG: checkpoint complete: wrote 3 > buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; > write=24.652 s, sync=104.256 s, total=3889.625 s; sync files=2, > longest=0.011 s, average=0.008 s; distance=0 kB, estimate=0 kB > > -- > > [local]:5432 ashu@postgres=# select * from pg_stat_progress_checkpoint; > -[ RECORD 1 ]-+- > pid | 22043 > type | checkpoint > kind | immediate force wait requested time > start_lsn | 0/14C60F8 > start_time| 2022-03-03 18:59:56.018662+05:30 > phase | finalizing > buffers_total | 0 > buffers_processed | 0 > buffers_written | 0 > > Any idea why this mismatch? Good catch. In BufferSync() we have 'num_to_scan' (buffers_total) which indicates the total number of buffers to be processed. Based on that, the 'buffers_processed' and 'buffers_written' counters get incremented. I meant these values may reach up to 'buffers_total'. The current pg_stat_progress_checkpoint view supports the above information. There is another place where 'ckpt_bufs_written' gets incremented (in SlruInternalWritePage()). This increment is above the 'buffers_total' value, and it is included in the server log message (checkpoint end) but not included in the view. I am a bit confused here. If we include this increment in the view, then we cannot calculate the exact 'buffers_total' beforehand. 
Can we increment 'buffers_total' as well when 'ckpt_bufs_written' gets incremented, so that we match the behaviour of the checkpoint-end message? Please share your thoughts.

---

> I think we can add a couple of more information to this view -
> start_time for buffer write operation and start_time for buffer sync
> operation. These are two very time consuming tasks in a checkpoint and
> people would find it useful to know how much time is being taken by
> the checkpoint in I/O operation phase. thoughts?

I felt the detailed progress is already shown for these two phases of the checkpoint via 'buffers_processed', 'buffers_written' and 'files_synced', hence I did not think about adding start times. If they are really required, I can add them.

> Is it that hard or costly to do? Just sending a message to increment
> the stat counter in RequestCheckpoint() would be enough.
>
> Also,
Re: [Proposal] vacuumdb --schema only
On Fri, 4 Mar 2022 at 14:41, Gilles Darold wrote:
> Hi,
>
> When we want to vacuum and/or analyze all tables in a dedicated schema,
> let's say pg_catalog for example, there is no easy way to do that. The
> VACUUM command doesn't allow it, so we have to use \gexec or a SQL script
> to do that. We have an external command, vacuumdb, that could be used to
> simplify this task. For example, the following command can be used to
> clean all tables stored in the pg_catalog schema:
>
> vacuumdb --schema pg_catalog -d foo

+1. This gives much better flexibility to users.

> The attached patch implements that. Option -n | --schema can be used
> multiple times and cannot be used together with options -a or -t.
>
> Common use cases are an application that creates lots of temporary
> objects and then drops them, which can heavily bloat the catalog, or one
> that does heavy work in some schemas only. Of course, the good practice is
> to find the bloated tables and execute VACUUM on each table, but if most
> of the tables in the schema are regularly bloated, the use of vacuumdb
> --schema can save time.
>
> I do not propose to extend the VACUUM and ANALYZE commands because their
> current syntax doesn't allow me to see an easy way to do that, and also
> because I'm not really in favor of such a change. But if there is interest
> in improving these commands I will be pleased to do that, with the
> syntax suggested.
>
> Best regards,
>
> --
> Gilles Darold
Re: [Proposal] vacuumdb --schema only
On Fri, Mar 04, 2022 at 10:11:28AM +0100, Gilles Darold wrote:
> The attached patch implements that. Option -n | --schema can be used
> multiple times and cannot be used together with options -a or -t.

Yes, thanks. I suggest there should also be an --exclude-schema.

> I do not propose to extend the VACUUM and ANALYZE commands because their
> current syntax doesn't allow me to see an easy way to do that

I think this would be easy with the parenthesized syntax. I'm not suggesting to do it there, though.

> + /*
> + * When filtereing on schema name, filter by table is not allowed.
> + * The schema name can already be set in a fqdn table name.

set *to*

--
Justin
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
On Fri, Mar 4, 2022 at 10:37 PM Bharath Rupireddy wrote:
> It looks like regression tests are failing on Windows Server 2019 [1].
> Ignore if it's reported elsewhere.

It's been broken since 46ab07ff "Clean up assorted failures under clang's -fsanitize=undefined checks.":

https://github.com/postgres/postgres/commits/master

I don't see what that patch has to do with the symptoms. It looks a bit like the cluster started by the "startcreate" step ("pg_ctl.exe start ...") is mysteriously no longer running by the time we get to the "test_pl" step ("Connection refused").
Re: Removing unneeded self joins
On 1/3/2022 03:03, Greg Stark wrote:
> On Thu, 1 Jul 2021 at 02:38, Ronan Dunklau wrote:
>> Well in some cases they can't, when the query is not emitting redundant
>> predicates by itself but they are added by something else like a view or
>> a RLS policy. Maybe it would be worth it to allow spending a bit more
>> time planning for those cases?
>
> Yeah, I'm generally in favour of doing more work in the optimizer to
> save query authors work writing queries.
>
> My question is whether it handles cases like:
>
> select b.x, c.y
> from t
> join t2 as b on (b.id = t.id)
> join t2 as c on (c.id = t.id)
>
> That is, if you join against the same table twice on the same qual.
> Does the EC mechanism turn this into a qual on b.id = c.id and then
> turn this into a self-join that can be removed?

Yes, the self-join removal machinery uses the EC mechanism as usual to get all join clauses, so this case works (see the demo in the attachment). Also, in the new version of the patch I fixed one stupid bug in the check of a self-join candidate's expression operator: we can remove only expressions like F(arg1) = G(arg2).

--
regards, Andrey Lepikhov
Postgres Professional

From 70398361a0a0d9c6c3c7ddd1fd305ac11138e7b1 Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov"
Date: Thu, 15 Jul 2021 15:26:13 +0300
Subject: [PATCH] Remove self-joins.

Remove inner joins of a relation to itself if we can prove that the join
can be replaced with a scan. We can prove the uniqueness using the
existing innerrel_is_unique machinery.

We can remove a self-join when for each outer row:
1. At most one inner row matches the join clauses.
2. If the join target list contains any inner vars, an inner row must be
   (physically) the same row as the outer one.

In this patch we use Rowley's [1] approach to identify a self-join:
1. Collect all mergejoinable join quals which look like a.x = b.x
2. Check innerrel_is_unique() for the qual list from (1). If it returns
   true, then the outer row matches only the same row from the inner
   relation.
This proves that the join is a self-join and can be replaced by a scan. Some regression tests changed due to the self-join removal logic.

[1] https://www.postgresql.org/message-id/raw/CAApHDvpggnFMC4yP-jUO7PKN%3DfXeErW5bOxisvJ0HvkHQEY%3DWw%40mail.gmail.com
---
 src/backend/optimizer/plan/analyzejoins.c | 888 +-
 src/backend/optimizer/plan/planmain.c     |   5 +
 src/backend/optimizer/util/joininfo.c     |   3 +
 src/backend/optimizer/util/relnode.c      |  26 +-
 src/backend/utils/misc/guc.c              |  10 +
 src/include/optimizer/pathnode.h          |   4 +
 src/include/optimizer/planmain.h          |   2 +
 src/test/regress/expected/equivclass.out  |  32 +
 src/test/regress/expected/join.out        | 426 +++
 src/test/regress/expected/sysviews.out    |   3 +-
 src/test/regress/sql/equivclass.sql       |  16 +
 src/test/regress/sql/join.sql             | 197 +
 12 files changed, 1583 insertions(+), 29 deletions(-)

diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 337f470d58..c5ac8e2bd4 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -22,6 +22,7 @@
  */
 #include "postgres.h"
+#include "catalog/pg_class.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
 #include "optimizer/joininfo.h"
@@ -32,10 +33,12 @@
 #include "optimizer/tlist.h"
 #include "utils/lsyscache.h"

+bool		enable_self_join_removal;
+
 /* local functions */
 static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
 static void remove_rel_from_query(PlannerInfo *root, int relid,
-								  Relids joinrelids);
+								  Relids joinrelids, int subst_relid);
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
 static bool rel_supports_distinctness(PlannerInfo *root, RelOptInfo *rel);
 static bool rel_is_distinct_for(PlannerInfo *root, RelOptInfo *rel,
@@ -47,6 +50,9 @@ static bool is_innerrel_unique_for(PlannerInfo *root,
 								   RelOptInfo *innerrel,
 								   JoinType jointype,
 								   List *restrictlist);
+static void change_rinfo(RestrictInfo* rinfo, Index from, Index to);
+static
Bitmapset* change_relid(Relids relids, Index oldId, Index newId);
+static void change_varno(Expr *expr, Index oldRelid, Index newRelid);

 /*
@@ -86,7 +92,7 @@ restart:
 			remove_rel_from_query(root, innerrelid,
 								  bms_union(sjinfo->min_lefthand,
-											sjinfo->min_righthand));
+
Re: Column Filtering in Logical Replication
On Wed, Mar 2, 2022 at 5:43 PM Tomas Vondra wrote:
>
> Attached is an updated patch, addressing most of the issues reported so
> far. There are various minor tweaks, but the main changes are:
...
>
> 3) checks of column filter vs. publish_via_partition_root and replica
> identity, following the same logic as the row-filter patch (hopefully,
> it touches the same places, using the same logic, ...)
>
> That means - with "publish_via_partition_root=false" it's not allowed to
> specify column filters on partitioned tables, only for leaf partitions.
>
> And we check column filter vs. replica identity when adding tables to
> publications, or whenever we change the replica identity.

This handling is different from the row filter work, and I see problems with it. The column list validation w.r.t. the primary key (the default replica identity) is missing.

The handling of column list vs. partitions has multiple problems: (a) in ATTACH PARTITION, the patch is just checking ancestors for RI validation, but what if the table being attached has further subpartitions; (b) I think the current locking also seems to have problems, because it is quite possible that while it validates the ancestors here, someone concurrently changes the column list. I think it won't be enough to just change the locking mode, because with the current patch strategy during attach we will first take locks for child tables of the current partition and then for parent tables, which can pose deadlock hazards. The column list validation also needs to be done when we change the publication action. There could be more similar problems which I might have missed.

For some of these (except for the concurrency issues), my colleague Shi-San has done testing and the results are below [1]. I feel we should do RI vs.
column list handling similar to the row filter work (at one place) to avoid all such hazards, with possibly similar handling at various places; otherwise there is a good chance that we will miss some places or make mistakes that are not easy to catch. Do let me know if you think it makes sense for me, or one of the people who worked on the row filter patch, to try this (make the handling of RI checks similar to the row filter work), and then we can see if that turns out to be a simple way to deal with all these problems.

Some other miscellaneous comments:
=

* In get_rel_sync_entry(), the handling for partitioned tables doesn't seem to be correct. It can publish a different set of columns based on the order of publications specified in the subscription. For example:

create table parent (a int, b int, c int) partition by range (a);
create table test_part1 (like parent);
alter table parent attach partition test_part1 for values from (1) to (10);
create publication pub for table parent(a) with (PUBLISH_VIA_PARTITION_ROOT);
create publication pub2 for table test_part1(b);

Now, depending on the order of publications in the list while defining the subscription, the column list will change:

create subscription sub connection 'port=1 dbname=postgres' publication pub, pub2;

For the above, the column list will be: (a)

create subscription sub connection 'port=1 dbname=postgres' publication pub2, pub;

For this one, the column list will be: (a, b)

To avoid this, the column list should be computed based on the final publish_as_relid, as we are doing for the row filter.

* Fetching column filter info in tablesync.c is quite expensive. It seems to be using four round-trips to get the complete info, whereas for the row filter we use just one round trip. I think we should try to get both row filter and column filter info in just one round trip.

[1] -

Test-1: The patch doesn't check when the primary key changes. e.g.
-- publisher --
create table tbl(a int primary key, b int);
create publication pub for table tbl(a);
alter table tbl drop CONSTRAINT tbl_pkey;
alter table tbl add primary key (b);
insert into tbl values (1,1);

-- subscriber --
create table tbl(a int, b int);
create subscription sub connection 'port=5432 dbname=postgres' publication pub;
update tbl set b=1 where a=1;
alter table tbl add primary key (b);

-- publisher --
delete from tbl;

Column "b" is part of the replica identity, but it is filtered, which caused an error on the subscriber side:

ERROR: publisher did not send replica identity column expected by the logical replication target relation "public.tbl"
CONTEXT: processing remote data during "DELETE" for replication target relation "public.tbl" in transaction 724 at 2022-03-04 11:46:16.330892+08

Test-2: Partitioned table RI w.r.t. column list.

2.1 Using "create table ... partition of". e.g.

-- publisher --
create table parent (a int, b int) partition by range (a);
create publication pub for table parent(a) with(publish_via_partition_root=true);
create table child partition of parent (primary key (a,b)) default;
insert into parent values (1,1);

-- subscriber --
create table parent (a int, b int) partition by range (a);
create table child
Re: pg_tablespace_location() failure with allow_in_place_tablespaces
On Fri, Mar 04, 2022 at 06:04:00PM +0900, Kyotaro Horiguchi wrote:
> And I made a quick hack on do_pg_start_backup. And I found that
> pg_basebackup copies in-place tablespaces under the *current
> directory*, which is not ok at all :(

Yeah, I noticed that as well while testing such configurations a couple of hours ago, but I am not sure yet how much we need to care about it, as in-place tablespaces are included in the main data directory anyway, which would be fine for most test purposes we usually care about. Perhaps this has an impact on the patch posted on the thread that wants to improve the guarantees around tablespace directory structures, but I have not studied that thread enough to have an opinion. And it is Friday.
--
Michael
Re: pg_tablespace_location() failure with allow_in_place_tablespaces
On Fri, Mar 4, 2022 at 10:04 PM Kyotaro Horiguchi wrote:
> And I made a quick hack on do_pg_start_backup. And I found that
> pg_basebackup copies in-place tablespaces under the *current
> directory*, which is not ok at all :(

Hmm. Which OS are you on? Looks OK here -- the "in place" tablespace gets copied as a directory under pg_tblspc, no symlink:

postgres=# set allow_in_place_tablespaces = on;
SET
postgres=# create tablespace ts1 location '';
CREATE TABLESPACE
postgres=# create table t (i int) tablespace ts1;
CREATE TABLE
postgres=# insert into t values (1), (2);
INSERT 0 2
postgres=# create user replication replication;
CREATE ROLE

$ pg_basebackup --user replication -D pgdata2

$ ls -slaph pgdata/pg_tblspc/
total 4.0K
   0 drwx-- 3 tmunro tmunro 19 Mar 4 23:16 ./
4.0K drwx-- 19 tmunro tmunro 4.0K Mar 4 23:16 ../
   0 drwx-- 3 tmunro tmunro 29 Mar 4 23:16 16384/

$ ls -slaph pgdata2/pg_tblspc/
total 4.0K
   0 drwx-- 3 tmunro tmunro 19 Mar 4 23:16 ./
4.0K drwx-- 19 tmunro tmunro 4.0K Mar 4 23:16 ../
   0 drwx-- 3 tmunro tmunro 29 Mar 4 23:16 16384/

$ ls -slaph pgdata/pg_tblspc/16384/PG_15_202203031/5/
total 8.0K
   0 drwx-- 2 tmunro tmunro 19 Mar 4 23:16 ./
   0 drwx-- 3 tmunro tmunro 15 Mar 4 23:16 ../
8.0K -rw--- 1 tmunro tmunro 8.0K Mar 4 23:16 16385

$ ls -slaph pgdata2/pg_tblspc/16384/PG_15_202203031/5/
total 8.0K
   0 drwx-- 2 tmunro tmunro 19 Mar 4 23:16 ./
   0 drwx-- 3 tmunro tmunro 15 Mar 4 23:16 ../
8.0K -rw--- 1 tmunro tmunro 8.0K Mar 4 23:16 16385

The warning from readlink() while making the mapping file isn't ideal, and perhaps we should suppress it with something like the attached. Or does the missing map file entry break something on Windows?

From 6f001ec46c5e5f6ffa8e103f2b0d711e6904b763 Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Fri, 4 Mar 2022 23:07:46 +1300
Subject: [PATCH] Fix warning in basebackup of in-place tablespaces.

While building the tablespace map for a base backup, don't call
readlink() on directories under pg_tblspc.
---
 src/backend/access/transam/xlog.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0d2bd7a357..b92cc66afb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -66,6 +66,7 @@
 #include "catalog/pg_control.h"
 #include "catalog/pg_database.h"
 #include "common/controldata_utils.h"
+#include "common/file_utils.h"
 #include "executor/instrument.h"
 #include "miscadmin.h"
 #include "pg_trace.h"
@@ -8292,6 +8293,10 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
 			snprintf(fullpath, sizeof(fullpath), "pg_tblspc/%s", de->d_name);

+			/* Skip in-place tablespaces (testing use only) */
+			if (get_dirent_type(fullpath, de, false, ERROR) == PGFILETYPE_DIR)
+				continue;
+
 #if defined(HAVE_READLINK) || defined(WIN32)
 			rllen = readlink(fullpath, linkpath, sizeof(linkpath));
 			if (rllen < 0)
-- 
2.30.2
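The directory-versus-symlink distinction the patch applies on the server side can be illustrated outside PostgreSQL. This is a hypothetical Python sketch, not backend code: a normal tablespace appears under pg_tblspc as a symlink named after its OID, while an in-place tablespace is a plain subdirectory, so the map-building loop should call readlink() only on real symlinks:

```python
import os
import tempfile

# Simulate a pg_tblspc directory: a normal tablespace is a symlink,
# an in-place tablespace (allow_in_place_tablespaces) is a plain
# subdirectory named after the tablespace OID.
pg_tblspc = tempfile.mkdtemp()
os.mkdir(os.path.join(pg_tblspc, "16384"))                      # in-place
os.symlink("/tmp/somewhere", os.path.join(pg_tblspc, "16385"))  # normal

# Building the tablespace map: skip plain directories instead of
# calling readlink() on them, analogous to the get_dirent_type() check
# the patch adds.
mapped = []
for name in sorted(os.listdir(pg_tblspc)):
    path = os.path.join(pg_tblspc, name)
    if not os.path.islink(path):
        continue                      # in-place tablespace: no map entry
    mapped.append((name, os.readlink(path)))

assert mapped == [("16385", "/tmp/somewhere")]
```

Only the symlinked tablespace ends up in the map, which matches the patch's behaviour of producing no map entry (and no warning) for in-place tablespaces.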
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
On Thu, Mar 3, 2022 at 10:05 PM Robert Haas wrote:
>
> On Fri, Feb 25, 2022 at 6:03 AM Bharath Rupireddy wrote:
> > Added a new function that returns the first and last valid WAL record
> > LSN of a given WAL file.
>
> Sounds like fuzzy thinking to me. WAL records can cross file
> boundaries, and forgetting about that leads to all sorts of problems.
> Just give people one function that decodes a range of LSNs and call it
> good. Why do you need anything else? If people want to get the first
> record that begins in a segment or the first record any portion of
> which is in a particular segment or the last record that begins in a
> segment or the last record that ends in a segment or any other such
> thing, they can use a WHERE clause for that... and if you think they
> can't, then that should be good cause to rethink the return value of
> the one-and-only SRF that I think you need here.

Thanks Robert, and thanks to the others for your review comments. Here's the v7 patch set.

These patches are based on the motto "keep it simple and short, yet effective and useful". With that in mind, I have not implemented the wait mode for any of the functions (it doesn't look like an effective use-case and would require adding a new page_read callback; instead, an error is thrown if a future LSN is specified). Also, these functions give WAL information on the current server's timeline. Having said that, I'm open to adding new functions in the future once this initial version gets in, if there's a requirement and users ask for them.

Please review the v7 patch set and provide your thoughts.

Regards,
Bharath Rupireddy.
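The "no wait mode" behaviour described above (error out on a future LSN instead of blocking until WAL appears) can be sketched as a toy range check. This is hypothetical Python with LSNs modelled as plain integers, not the extension's C API:

```python
# Toy model of the no-wait decision described above: reject a start LSN
# beyond the current flush point instead of waiting for WAL, and clamp
# the end of the requested range to what already exists.
# check_lsn_range and the integer LSNs are hypothetical illustrations.
def check_lsn_range(start_lsn, end_lsn, current_flush_lsn):
    if start_lsn > current_flush_lsn:
        raise ValueError("WAL start LSN is in the future")
    return min(end_lsn, current_flush_lsn)

assert check_lsn_range(10, 50, 40) == 40   # clamp to the flushed WAL
assert check_lsn_range(10, 30, 40) == 30   # whole range is available
```

The alternative, a wait mode, would need a page_read callback that blocks until the requested WAL is flushed, which is exactly what the message above chose not to implement.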
From df9e5b5f9a76a2002d9ef8d9b0db88003a07a5b5 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Fri, 4 Mar 2022 08:12:20 + Subject: [PATCH v7] pg_walinspect --- contrib/Makefile | 1 + contrib/pg_walinspect/.gitignore | 4 + contrib/pg_walinspect/Makefile | 26 + contrib/pg_walinspect/pg_walinspect--1.0.sql | 91 ++ contrib/pg_walinspect/pg_walinspect.c| 949 +++ contrib/pg_walinspect/pg_walinspect.control | 5 + src/backend/access/transam/xlogreader.c | 14 +- src/bin/pg_waldump/pg_waldump.c | 5 + src/common/relpath.c | 18 + src/include/access/xlog.h| 2 +- src/include/access/xlog_internal.h | 2 +- src/include/access/xlogreader.h | 2 - src/include/common/relpath.h | 1 + 13 files changed, 1109 insertions(+), 11 deletions(-) create mode 100644 contrib/pg_walinspect/.gitignore create mode 100644 contrib/pg_walinspect/Makefile create mode 100644 contrib/pg_walinspect/pg_walinspect--1.0.sql create mode 100644 contrib/pg_walinspect/pg_walinspect.c create mode 100644 contrib/pg_walinspect/pg_walinspect.control diff --git a/contrib/Makefile b/contrib/Makefile index e3e221308b..705c6fc36b 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -40,6 +40,7 @@ SUBDIRS = \ pgrowlocks \ pgstattuple \ pg_visibility \ + pg_walinspect \ postgres_fdw \ seg \ spi \ diff --git a/contrib/pg_walinspect/.gitignore b/contrib/pg_walinspect/.gitignore new file mode 100644 index 00..5dcb3ff972 --- /dev/null +++ b/contrib/pg_walinspect/.gitignore @@ -0,0 +1,4 @@ +# Generated subdirectories +/log/ +/results/ +/tmp_check/ diff --git a/contrib/pg_walinspect/Makefile b/contrib/pg_walinspect/Makefile new file mode 100644 index 00..c92a97447f --- /dev/null +++ b/contrib/pg_walinspect/Makefile @@ -0,0 +1,26 @@ +# contrib/pg_walinspect/Makefile + +MODULE_big = pg_walinspect +OBJS = \ + $(WIN32RES) \ + pg_walinspect.o +PGFILEDESC = "pg_walinspect - functions to inspect contents of PostgreSQL Write-Ahead Log" + +PG_CPPFLAGS = -I$(libpq_srcdir) +SHLIB_LINK_INTERNAL = $(libpq) + +EXTENSION = pg_walinspect 
+DATA = pg_walinspect--1.0.sql + +REGRESS = pg_walinspect + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/pg_walinspect +top_builddir = ../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif diff --git a/contrib/pg_walinspect/pg_walinspect--1.0.sql b/contrib/pg_walinspect/pg_walinspect--1.0.sql new file mode 100644 index 00..619b1d1d4a --- /dev/null +++ b/contrib/pg_walinspect/pg_walinspect--1.0.sql @@ -0,0 +1,91 @@ +/* contrib/pg_walinspect/pg_walinspect--1.0.sql */ + +-- complain if script is sourced in psql, rather than via CREATE EXTENSION +\echo Use "CREATE EXTENSION pg_walinspect" to load this file. \quit + +-- +-- pg_get_raw_wal_record() +-- +CREATE FUNCTION pg_get_raw_wal_record(IN in_lsn pg_lsn, +OUT lsn pg_lsn, +OUT record bytea +) +AS 'MODULE_PATHNAME', 'pg_get_raw_wal_record' +LANGUAGE C CALLED ON NULL INPUT PARALLEL SAFE; + +GRANT EXECUTE ON FUNCTION pg_get_raw_wal_record(pg_lsn) TO pg_monitor; + +-- +--
Re: Doc about how to set max_wal_senders when setting minimal wal_level
On Fri, 04 Mar 2022 at 14:05, Kyotaro Horiguchi wrote:
> At Fri, 04 Mar 2022 12:18:29 +0800, Japin Li wrote in
>>
>> On Thu, 03 Mar 2022 at 12:10, Japin Li wrote:
>>
>> Attach v3 patch to fix missing close varname tag.
>
> +A precondition for using minimal WAL is to disable WAL archiving and
> +streaming replication by setting <xref linkend="guc-max-wal-senders"/>
> +to 0, and archive_mode
> +to off.
>
> It is a bit odd that the features to stop and the corresponding GUCs
> are written irrespectively. It would be better if they're in the same
> order.

Thanks for your review. Modified.

> +servers. If setting max_wal_senders to
> +0 consider also reducing the amount of WAL produced
> +by changing wal_level to minimal.
>
> Those who naively follow this suggestion may bump into failure when
> archive_mode is on. Thus archive_mode is also worth referring to. But,
> anyway, IMHO, it is merely a performance tip that is not necessarily
> required in this section, or even in this documentation. In addition,
> if we write this for max_wal_senders, archive_mode will deserve a
> similar tip, but I think it is too verbose. In short, I would not add
> that description at all.

It already has a tip about wal_level for archive_mode [1], IIUC:
archive_mode cannot be enabled when wal_level is set to minimal.

[1] https://www.postgresql.org/docs/devel/runtime-config-wal.html#GUC-ARCHIVE-MODE

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7ed8c82a9d..a70adb7030 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2775,6 +2775,10 @@ include_dir 'conf.d'
 minimal makes any base backups taken before
 unavailable for archive recovery and standby server, which may
 lead to data loss.
+A precondition for using minimal WAL is to disable WAL archiving and
+streaming replication by setting archive_mode to
+off, and max_wal_senders to
+0.
 In logical level, the same information is logged as
@@ -3535,6 +3539,7 @@ include_dir 'conf.d'
 mode. In always mode, all files restored from
 the archive or streamed with streaming replication will be archived
 (again). See for details.
+The default value is off.
 This parameter can only be set at server start.
@@ -4096,7 +4101,9 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
 reconnect. This parameter can only be set at server start. Also,
 wal_level must be set to
 replica or higher to allow connections from standby
-servers.
+servers. If setting max_wal_senders to
+0 consider also reducing the amount of WAL produced
+by changing wal_level to minimal.
Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Hi,

It looks like regression tests are failing on Windows Server 2019 [1]. Ignore if it's reported elsewhere.

[1] https://github.com/postgres/postgres/runs/5419324953

❌ 00:02 test_pl
[08:28:53.758]
[08:28:53.758] c:\cirrus>call "C:/Program Files/Git/usr/bin/timeout.exe" -v -k60s 15m perl src/tools/msvc/vcregress.pl plcheck
[08:28:53.953]
[08:28:53.953] Checking plpgsql
[08:28:53.977] (using postmaster on Unix socket, default port)
[08:28:53.977] == dropping database "pl_regression" ==
[08:28:56.044] psql: error: connection to server at "localhost" (::1), port 5432 failed: Connection refused (0x274D/10061)
[08:28:56.044] Is the server running on that host and accepting TCP/IP connections?
[08:28:56.044] connection to server at "localhost" (127.0.0.1), port 5432 failed: Connection refused (0x274D/10061)
[08:28:56.044] Is the server running on that host and accepting TCP/IP connections?
[08:28:56.050] command failed: "c:/cirrus/Debug/psql/psql" -X -c "SET client_min_messages = warning" -c "DROP DATABASE IF EXISTS \"pl_regression\"" "postgres"
[08:28:56.063]
[08:28:56.063] c:\cirrus>if 2 NEQ 0 exit /b 2
[08:28:56.084]
[08:28:56.084] Exit status: 2

Regards,
Bharath Rupireddy.
[Proposal] vacuumdb --schema only
Hi,

When we want to vacuum and/or analyze all tables in a dedicated schema, let's say pg_catalog for example, there is no easy way to do that. The VACUUM command doesn't allow it, so we have to use \gexec or a SQL script to do that. We have an external command, vacuumdb, that could be used to simplify this task. For example, the following command can be used to clean all tables stored in the pg_catalog schema:

vacuumdb --schema pg_catalog -d foo

The attached patch implements that. Option -n | --schema can be used multiple times and cannot be used together with options -a or -t.

Common use cases are an application that creates lots of temporary objects and then drops them, which can heavily bloat the catalog, or one that does heavy work in some schemas only. Of course, the good practice is to find the bloated tables and execute VACUUM on each table, but if most of the tables in the schema are regularly bloated, the use of vacuumdb --schema can save time.

I do not propose to extend the VACUUM and ANALYZE commands because their current syntax doesn't allow me to see an easy way to do that, and also because I'm not really in favor of such a change. But if there is interest in improving these commands I will be pleased to do that, with the syntax suggested.

Best regards,

--
Gilles Darold

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 956c0f01cb..e4f6d32ba9 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -39,6 +39,24 @@ PostgreSQL documentation
 dbname
+
+ vacuumdb
+ connection-option
+ option
+
+
+
+ -n
+ --schema
+
+ schema
+
+
+
+ dbname
+
 vacuumdb
 connection-option
@@ -244,6 +262,17 @@ PostgreSQL documentation
+
+ -n schema
+ --schema=schema
+
+
+Clean or analyze all tables in schema only.
+Multiple schemas can be vacuumed by writing multiple -n switches.
+
+
+
 --no-index-cleanup
@@ -619,6 +648,14 @@ PostgreSQL documentation
 $ vacuumdb --analyze --verbose --table='foo(bar)' xyzzy
+
+To clean all tables in the Foo and bar schemas
+only in a database named xyzzy:
+
+$ vacuumdb --schema='"Foo"' --schema='bar' xyzzy
+
+
+
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 96a818a3c1..4c4f47e32a 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -103,6 +103,8 @@ $node->safe_psql(
 	CREATE TABLE funcidx (x int);
 	INSERT INTO funcidx VALUES (0),(1),(2),(3);
 	CREATE INDEX i0 ON funcidx ((f1(x)));
+	CREATE SCHEMA "Foo";
+	CREATE TABLE "Foo".bar(id int);
 |);
 $node->command_ok([qw|vacuumdb -Z --table="need""q(uot"(")x") postgres|],
 	'column list');
@@ -146,5 +148,12 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '--min-xid-age', '2147483001', 'postgres' ],
 	qr/GREATEST.*relfrozenxid.*2147483001/,
 	'vacuumdb --table --min-xid-age');
+$node->issues_sql_like(
+	[ 'vacuumdb', '--schema', '"Foo"', 'postgres' ],
+	qr/VACUUM "Foo".*/,
+	'vacuumdb --schema schema only');
+$node->command_fails(
+	[ 'vacuumdb', '-n', 'pg_catalog', '-t pg_class', 'postgres' ],
+	'cannot vacuum all tables in schema(s) and specific table(s) at the same time');
 done_testing();
diff --git a/src/bin/scripts/t/101_vacuumdb_all.pl b/src/bin/scripts/t/101_vacuumdb_all.pl
index 1dcf411767..b122c995b1 100644
--- a/src/bin/scripts/t/101_vacuumdb_all.pl
+++ b/src/bin/scripts/t/101_vacuumdb_all.pl
@@ -15,5 +15,8 @@ $node->issues_sql_like(
 	[ 'vacuumdb', '-a' ],
 	qr/statement: VACUUM.*statement: VACUUM/s,
 	'vacuum all databases');
+$node->command_fails(
+	[ 'vacuumdb', '-a', '-n', 'pg_catalog' ],
+	'cannot vacuum specific schema(s) in all databases');
 done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 4f6917fd39..69b470598f 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,10 +46,10 @@ typedef struct vacuumingOptions
 	bool		process_toast;
 } vacuumingOptions;

-
 static void vacuum_one_database(ConnParams *cparams, vacuumingOptions *vacopts,
 								int stage,
+								SimpleStringList *schemas,
 								SimpleStringList *tables,
 								int concurrentCons,
 								const char *progname, bool echo, bool quiet);
@@ -94,6 +94,7 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"jobs", required_argument, NULL, 'j'},
 		{"parallel", required_argument, NULL, 'P'},
+		{"schema", required_argument, NULL, 'n'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{"analyze-in-stages", no_argument, NULL, 3},
 		{"disable-page-skipping", no_argument, NULL, 4},
@@ -125,6 +126,7 @@ main(int argc, char *argv[])
 	SimpleStringList tables = {NULL, NULL};
 	int			concurrentCons = 1;
 	int			tbl_count = 0;
+	SimpleStringList schemas = {NULL, NULL};

 	/* initialize options */
Re: pg_tablespace_location() failure with allow_in_place_tablespaces
At Fri, 04 Mar 2022 17:28:45 +0900 (JST), Kyotaro Horiguchi wrote in
> By the way, regardless of the patch, I got an error from pg_basebackup
> for an in-place tablespace. pg_do_start_backup calls readlink
> believing pg_tblspc/* is always a symlink.
>
> # Running: pg_basebackup -D
> /home/horiguti/work/worktrees/tsp_replay_2/src/test/recovery/tmp_check/t_029_replay_tsp_drops_primary1_data/backup/my_backup
> -h /tmp/X8E4nbF4en -p 51584 --checkpoint fast --no-sync
> WARNING: could not read symbolic link "pg_tblspc/16384": Invalid argument
>
> So now we know that there are three places that need the same
> processing.
>
> pg_tablespace_location: this patch tries to fix it
> sendDir:                already supports in-place tsp
> do_pg_start_backup:     does not support in-place tsp yet

And I made a quick hack on do_pg_start_backup. And I found that pg_basebackup copies in-place tablespaces under the *current directory*, which is not ok at all :(

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: psql - add SHOW_ALL_RESULTS option
On 15.01.22 10:00, Fabien COELHO wrote:
>> The reason this test constantly fails on cfbot windows is a
>> use-after-free bug.
>
> Indeed! Thanks a lot for the catch and the debug!
>
> The ClearOrSaveResult function is quite annoying because it may or may
> not clear the result as a side effect. Attached v14 moves the status
> extraction before the possible clear. I've added a couple of
> results = NULL after such calls in the code.

Are you planning to send a rebased patch for this commit fest?
Re: refactoring basebackup.c
Hi,

>> It will be good if we can also fix
>> CreateWalTarMethod to support LZ4 and ZSTD.
>
> Ok, we will see; either Dipesh or I will take care of it.

I took a look at CreateWalTarMethod with an eye to supporting LZ4 compression for WAL files. The current implementation takes three steps to back up a WAL file into a tar archive. For each file:

1. It first writes the header in the function tar_open_for_write, flushes the contents of the tar to disk, and stores the header offset.
2. Next, the contents of the WAL file are written to the tar archive.
3. At the end, it recalculates the checksum in tar_close() and overwrites the header at the offset stored in step #1.

The need for overwriting the header in CreateWalTarMethod is mainly related to partial WAL files, where the size of the WAL file is less than WalSegSize. The file is padded and the checksum is recalculated after adding the pad bytes.

If we go ahead and implement LZ4 support for CreateWalTarMethod, then we have a problem at step #3. In order to achieve a better compression ratio, compressed LZ4 blocks are linked to each other, and these blocks are decoded sequentially. If we overwrite the header as part of step #3, it corrupts the link between compressed LZ4 blocks. LZ4 does provide an option to write each compressed block independently (with blockMode set to LZ4F_blockIndependent), but it is still a problem because we don't know whether overwriting the header after recalculating the checksum will avoid overlapping the boundary of the next block.

GZIP manages to overcome this problem because it provides an option to turn compression on and off on the fly while writing a compressed archive, via the zlib library function deflateParams(). The current gzip implementation for CreateWalTarMethod uses this function to turn off compression just before step #1, and it writes the uncompressed header of size equal to TAR_BLOCK_SIZE.
It uses the same library function to turn on the compression for writing the contents of the WAL file as part of step #2. It again turns off the compression just before step #3 to overwrite the header. The header is overwritten at the same offset with size equal to TAR_BLOCK_SIZE. Since GZIP provides this option to enable/disable compression, it is possible to control the size of data we are writing to a compressed archive. Even if we overwrite an already written block in a compressed archive there is no risk of it overlapping with the boundary of the next block. This mechanism is not available in LZ4 and ZSTD. In order to support LZ4 and ZSTD compression for CreateWalTarMethod we may need to refactor this code unless I am missing something. We need to somehow add the padding bytes in case of partial WAL before we send it to the compressed archive. This will make sure that all files which are being compressed does not require any padding as the size is always equal to WalSegSize. There is no need to recalculate the checksum and we can avoid overwriting the header as part of step #3. Thoughts? Thanks, Dipesh
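The proposed refactoring, padding the partial segment up front so the tar header is final on first write, can be sketched as plain C. The function name and interface are illustrative, not actual backend code:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of the proposal: zero-pad a partial WAL segment up to
 * wal_seg_size BEFORE handing it to the compressed archive.  The tar
 * header (size and checksum) can then be computed once and never
 * rewritten, so nothing is overwritten inside the LZ4/ZSTD stream.
 * Returns a malloc'd buffer of exactly wal_seg_size bytes, or NULL.
 */
static char *
pad_wal_segment(const char *data, size_t len, size_t wal_seg_size)
{
	char	   *padded;

	if (len > wal_seg_size)
		return NULL;			/* caller error: segment too large */

	padded = malloc(wal_seg_size);
	if (padded == NULL)
		return NULL;

	memcpy(padded, data, len);	/* the real WAL contents */
	memset(padded + len, 0, wal_seg_size - len);	/* zero padding */
	return padded;
}
```

Because every buffer handed to the compressor is then exactly WalSegSize bytes, the header written in step #1 is already correct and step #3's rewrite disappears, which is what the LZ4/ZSTD sequential block layout requires.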
Re: pg_tablespace_location() failure with allow_in_place_tablespaces
At Fri, 04 Mar 2022 16:54:49 +0900 (JST), Kyotaro Horiguchi wrote in
> At Fri, 04 Mar 2022 16:41:03 +0900 (JST), Kyotaro Horiguchi wrote in
>> At Fri, 4 Mar 2022 15:44:22 +0900, Michael Paquier wrote in
>>> The use may be limited to any automated testing and
>>> allow_in_place_tablespaces is a developer GUC, still it seems to me
>>> that there is an argument to allow the case rather than tweak any
>>> tests to hardcode a path with the tablespace OID. And any other code
>>> paths are able to handle such tablespaces, be they in recovery or in
>>> tablespace create/drop.
>>
>> +1
>
> By the way, regardless of the patch, I got an error from pg_basebackup
> for an in-place tablespace. pg_do_start_backup calls readlink
> believing pg_tblspc/* is always a symlink.
>
> # Running: pg_basebackup -D /home/horiguti/work/worktrees/tsp_replay_2/src/test/recovery/tmp_check/t_029_replay_tsp_drops_primary1_data/backup/my_backup -h /tmp/X8E4nbF4en -p 51584 --checkpoint fast --no-sync
> WARNING: could not read symbolic link "pg_tblspc/16384": Invalid argument

So now we know that there are three places that need the same processing:

- pg_tablespace_location(): this patch tries to fix it
- sendDir(): already supports in-place tablespaces
- do_pg_start_backup(): does not support in-place tablespaces yet

regards. -- Kyotaro Horiguchi NTT Open Source Software Center
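The readlink() failure happens because an in-place tablespace is a plain directory under pg_tblspc/, not a symlink, and readlink() on a directory fails with EINVAL. The common fix all three places need is to check the entry type first, e.g. with lstat(). A minimal sketch (the helper name is illustrative, not backend code):

```c
#include <sys/stat.h>

/*
 * Return 1 if path is a symlink (a conventional tablespace link),
 * 0 if it is something else (e.g. an in-place tablespace directory),
 * and -1 on lstat() failure.  Callers should only readlink() when
 * this returns 1, avoiding the EINVAL seen in pg_basebackup.
 */
static int
is_symlink(const char *path)
{
	struct stat st;

	if (lstat(path, &st) != 0)
		return -1;				/* entry missing or not accessible */
	return S_ISLNK(st.st_mode) ? 1 : 0;
}
```

With a check like this, pg_tablespace_location(), sendDir(), and do_pg_start_backup() could all treat an in-place tablespace directory as "location is the path itself" instead of attempting readlink().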
RE: Logical replication timeout problem
Dear Wang,

> Some codes were added in ReorderBufferProcessTXN() for treating DDL,

My mailer went wrong, so I'll post my comments again. Sorry.

Some code was added in ReorderBufferProcessTXN() to handle DDL, but I doubt that updating accept_writes is needed. In my understanding, the parameter is read by OutputPluginPrepareWrite() in order to align messages: they should have a header, like 'w', before their body. But here only a keepalive message is sent, with no meaningful changes, so I think it might not be needed. I commented out the line and tested as you did [1], and found no timeouts or errors. Do you have any reason for keeping it?

[1] https://www.postgresql.org/message-id/OS3PR01MB6275A95FD44DC6C46058EA389E3B9%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
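The framing distinction behind the question can be shown with a toy buffer model. This simulates only the shape of the protocol (a 'w' header before a plugin-written body versus a header-only keepalive); it is not the actual walsender or OutputPluginPrepareWrite() code, and the function names are invented for illustration:

```c
#include <string.h>

/*
 * Toy model: a decoded change needs the 'w' header (what
 * OutputPluginPrepareWrite() arranges) followed by a body, whereas a
 * keepalive is header-only, so no accept_writes update should be
 * required just to send one.  Returns bytes written into buf.
 */
static size_t
emit_change(char *buf, const char *body)
{
	size_t		blen = strlen(body);

	buf[0] = 'w';				/* header before the body */
	memcpy(buf + 1, body, blen);	/* body supplied by the plugin */
	return 1 + blen;
}

static size_t
emit_keepalive(char *buf)
{
	buf[0] = 'k';				/* keepalive: header only, no body */
	return 1;
}
```

In this model a keepalive never carries a plugin body, which is the intuition behind commenting out the accept_writes update and seeing no errors.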