Re: [PATCH] Prefer getenv("HOME") to find the UNIX home directory

2021-10-19 Thread Kyotaro Horiguchi
At Tue, 19 Oct 2021 02:44:03 -0700, Anders Kaseorg  wrote in 
> On 10/19/21 01:34, Kyotaro Horiguchi wrote:
> > I tend to agree with this, but seeing ssh ignore $HOME, I'm not sure
> > it's safe for us to follow the variable, at least when accessing
> > confidential files.  Since I don't understand the exact reasoning
> > for ssh's behavior, this is just my humble opinion.
> 
> According to https://bugzilla.mindrot.org/show_bug.cgi?id=3048#c1, it
> used to be supported to install the ssh binary as setuid.  A
> setuid/setgid binary needs to treat all environment variables with
> suspicion: if it can be convinced to write a file to $HOME with root
> privileges, then a user who modifies $HOME before invoking the binary
> could cause it to write to a file that the user normally couldn’t.
> 
> There’s no such concern for a binary that isn’t setuid/setgid.  Anyone
> with the ability to modify $HOME can be assumed to already have full
> control of the user account.

Thanks for the link.  I'm still not sure that's the whole story, but it
sounds reasonable enough.  If that's the case, I vote +1 for psql and
other commands honoring $HOME.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


Re: TAP test for recovery_end_command

2021-10-19 Thread Michael Paquier
On Wed, Oct 06, 2021 at 06:49:10PM +0530, Amul Sul wrote:
> Thanks for the suggestion, added the same in the attached version.

Hmm.  The run-time of 020_archive_status.pl bumps from 4.7s to 5.8s on
my laptop, so the change is noticeable.  I agree that it would be good
to have more coverage for those commands, but I also think that we
should make things cheaper if we can, particularly knowing that those
commands just touch a file.  This patch creates two standbys for its
purpose, but do we really need that many?

On top of that, 020_archive_status.pl does not look like the correct
place for this set of tests.  002_archiving.pl would be a better
candidate, where we already have two standbys that get promoted, so
you could have both the failure and success cases there.  There should
be no need for extra wait phases either.

+$standby4->append_conf('postgresql.conf',
+   "recovery_end_command = 'echo recovery_ended > /non_existing_dir/file'");
I am wondering how this finishes on Windows.
--
Michael


signature.asc
Description: PGP signature


Re: LogicalChanges* and LogicalSubxact* wait events are never reported

2021-10-19 Thread Michael Paquier
On Wed, Oct 20, 2021 at 02:12:20PM +0900, Masahiro Ikeda wrote:
> If my understanding is right, it's better to remove them since they
> confuse users.  Please see the attached patch.  I confirmed that make
> check-world passes all tests.

Yeah, I don't see the point in keeping these events around if they are
not used.  Perhaps Amit has some plans for them, though.
--
Michael




Re: pg_receivewal starting position

2021-10-19 Thread Michael Paquier
On Tue, Oct 19, 2021 at 05:32:55PM +0200, Ronan Dunklau wrote:
> Following recommendations, I stripped most of the features from the patch.
> For now we support only physical replication slots, and only provide the
> two fields of interest (restart_lsn, restart_tli) in addition to the slot
> type (fixed at physical), so as not to paint ourselves into a corner.
> 
> I also removed the part about pg_basebackup since other fixes have been
> proposed for that case.

Patch 0001 looks rather clean.  I have a couple of comments.

+   if (OidIsValid(slot_contents.data.database))
+   elog(ERROR, "READ_REPLICATION_SLOT is only supported for physical slots");

elog() can only be used for internal errors.  Errors that can be
triggered by a user should use ereport() instead.

+ok($stdout eq '||',
+   "READ_REPLICATION_SLOT returns NULL values if slot does not exist");
[...]
+ok($stdout =~ 'physical\|[^|]*\|1',
+   "READ_REPLICATION_SLOT returns tuple corresponding to the slot");
Isn't result pattern matching something we usually test with like()?

+($ret, $stdout, $stderr) = $node_primary->psql(
+   'postgres',
+   "READ_REPLICATION_SLOT $slotname;",
+   extra_params => [ '-d', $connstr_rep ]);
No need for extra_params in this test.  You can just pass down
"replication => 1" instead, no?

--- a/src/test/recovery/t/006_logical_decoding.pl
+++ b/src/test/recovery/t/006_logical_decoding.pl
[...]
+ok($stderr=~ 'READ_REPLICATION_SLOT is only supported for physical slots',
+   'Logical replication slot is not supported');
This one should use like().

+
+The slot's restart_lsn can also be used as a starting
+point if the target directory is empty.
+
I am not sure that there is a need for this addition as the same thing
is said when describing the lookup ordering.

+  If nothing is found and a slot is specified, use the
+  READ_REPLICATION_SLOT
+  command.
It may be clearer to say that the position is retrieved from the
command.

+bool
+GetSlotInformation(PGconn *conn, const char *slot_name,
+                  XLogRecPtr *restart_lsn, TimeLineID *restart_tli)
+{
Could you extend that so that we still run the command but don't crash
if the caller specifies NULL for any of the result fields?  This would
be handy.

+   if (PQgetisnull(res, 0, 0))
+   {
+       PQclear(res);
+       pg_log_error("replication slot \"%s\" does not exist", slot_name);
+       return false;
+   }
+   if (PQntuples(res) != 1 || PQnfields(res) < 3)
+   {
+       pg_log_error("could not fetch replication slot: got %d rows and %d fields, expected %d rows and %d or more fields",
+                    PQntuples(res), PQnfields(res), 1, 3);
+       PQclear(res);
+       return false;
+   }
Wouldn't it be better to reverse the order of these two checks?

I don't mind the slot type being part of the result of
READ_REPLICATION_SLOT, even if it is not mandatory, but
GetSlotInformation() should at least cross-check it.

+# Setup the slot, and connect to it a first time
+$primary->run_log(
+   [ 'pg_receivewal', '--slot', $slot_name, '--create-slot' ],
+   'creating a replication slot');
+$primary->psql('postgres',
+   'INSERT INTO test_table VALUES (generate_series(1,100));');
+$primary->psql('postgres', 'SELECT pg_switch_wal();');
+$nextlsn =
+  $primary->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn();');
+chomp($nextlsn);
Wouldn't it be simpler to use CREATE_REPLICATION_SLOT with RESERVE_WAL
here, rather than going through pg_receivewal?  It seems to me that
this would be cheaper without really impacting the coverage.
--
Michael




LogicalChanges* and LogicalSubxact* wait events are never reported

2021-10-19 Thread Masahiro Ikeda
Hi,

While reading the documentation and source code for wait events,
I found that the following wait events are never reported.

* LogicalChangesRead: Waiting for a read from a logical changes file.
* LogicalChangesWrite: Waiting for a write to a logical changes file.
* LogicalSubxactRead: Waiting for a read from a logical subxact file.
* LogicalSubxactWrite: Waiting for a write to a logical subxact file.


The wait events were introduced by the following commit.

 Add support for streaming to built-in logical replication.
 Amit Kapila on 2020/9/3 11:24:07
 464824323e57dc4b397e8b05854d779908b55304

I read the above discussion and found that the wait events were reported
at first, but they were later removed as unnecessary because
BufFileWrite/BufFileRead are enough ([1]).


If my understanding is right, it's better to remove them since they
confuse users.  Please see the attached patch.  I confirmed that make
check-world passes all tests.

[1]
https://www.postgresql.org/message-id/CAA4eK1JV37jXUT5LeWzkBDNNnSntwQbLUZAj6m82QMiR1ZuuHQ%40mail.gmail.com

Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 7355835202..3173ec2566 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1549,22 +1549,6 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
   WALWrite
   Waiting for a write to a WAL file.
  
- 
-  LogicalChangesRead
-  Waiting for a read from a logical changes file.
- 
- 
-  LogicalChangesWrite
-  Waiting for a write to a logical changes file.
- 
- 
-  LogicalSubxactRead
-  Waiting for a read from a logical subxact file.
- 
- 
-  LogicalSubxactWrite
-  Waiting for a write to a logical subxact file.
- 
 

   
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index ef7e6bfb77..4a5b7502f5 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -717,18 +717,6 @@ pgstat_get_wait_io(WaitEventIO w)
 		case WAIT_EVENT_WAL_WRITE:
 			event_name = "WALWrite";
 			break;
-		case WAIT_EVENT_LOGICAL_CHANGES_READ:
-			event_name = "LogicalChangesRead";
-			break;
-		case WAIT_EVENT_LOGICAL_CHANGES_WRITE:
-			event_name = "LogicalChangesWrite";
-			break;
-		case WAIT_EVENT_LOGICAL_SUBXACT_READ:
-			event_name = "LogicalSubxactRead";
-			break;
-		case WAIT_EVENT_LOGICAL_SUBXACT_WRITE:
-			event_name = "LogicalSubxactWrite";
-			break;
 
 			/* no default case, so that compiler will warn */
 	}
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 6007827b44..c22142365f 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -221,11 +221,7 @@ typedef enum
 	WAIT_EVENT_WAL_READ,
 	WAIT_EVENT_WAL_SYNC,
 	WAIT_EVENT_WAL_SYNC_METHOD_ASSIGN,
-	WAIT_EVENT_WAL_WRITE,
-	WAIT_EVENT_LOGICAL_CHANGES_READ,
-	WAIT_EVENT_LOGICAL_CHANGES_WRITE,
-	WAIT_EVENT_LOGICAL_SUBXACT_READ,
-	WAIT_EVENT_LOGICAL_SUBXACT_WRITE
+	WAIT_EVENT_WAL_WRITE
 } WaitEventIO;
 
 


Re: pgsql: Document XLOG_INCLUDE_XID a little better

2021-10-19 Thread Dilip Kumar
On Mon, Oct 18, 2021 at 10:48 AM Amit Kapila  wrote:

>
> Today, I have looked at this patch again and slightly changed a
> comment, one of the function names, and a variable name.  Do let me
> know if you or others have any suggestions for better names or
> otherwise.  I think we should backpatch this to 14 as well, where this
> code was introduced.
>

 bool
-IsSubTransactionAssignmentPending(void)
+IsTopTransactionIdLogged(void)
 {
  /* wal_level has to be logical */
  if (!XLogLogicalInfoActive())
@@ -6131,19 +6131,20 @@ IsSubTransactionAssignmentPending(void)
  if (!TransactionIdIsValid(GetCurrentTransactionIdIfAny()))
  return false;

- /* and it should not be already 'assigned' */
- return !CurrentTransactionState->assigned;
+ /* and it should not be already 'logged' */
+ return !CurrentTransactionState->topXidLogged;
 }

I have one comment here, basically, you have changed the function name
to "IsTopTransactionIdLogged", but it still behaves like
IsTopTransactionIdLogPending.  Now with the new name, it should return
(CurrentTransactionState->topXidLogged) instead of
(!CurrentTransactionState->topXidLogged).

And the caller should also be changed accordingly.  Other changes look fine.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: BUG #17220: ALTER INDEX ALTER COLUMN SET (..) with an optionless opclass makes index and table unusable

2021-10-19 Thread Michael Paquier
On Tue, Oct 19, 2021 at 02:40:04PM +, Bossart, Nathan wrote:
> On 10/18/21, 11:49 PM, "Matthijs van der Vleuten"  wrote:
>> The test case doesn't seem entirely correct to me? The index being
>> dropped (btree_tall_tbl_idx2) doesn't exist.
> 
> This was fixed before it was committed [0].

Yes, my apologies about this brain fade.  The committed code is
hopefully fine :)

>> Also, I don't believe this tests the case of dropping the index when
>> it previously has been altered in this way.
> 
> That can still fail with the "has no options" ERROR, and fixing it
> will still require a manual catalog update.  The ERROR is actually
> coming from the call to index_open(), so bypassing it might require
> some rather intrusive changes.  Given that it took over a year for
> this bug to be reported, I suspect it might be more trouble than it's
> worth.

This may need a mention in the release notes, but I guess the problem
is not widespread enough to worry about, or people would have
complained more since 13 came out.  Logical dumps discard that
automatically, and even in the ANALYZE case, the code before this
commit would have just ignored the reloptions retrieved by
get_attribute_options().
--
Michael




Re: Fixing build of MSVC with OpenSSL 3.0.0

2021-10-19 Thread Michael Paquier
On Tue, Oct 19, 2021 at 01:09:28PM +0200, Daniel Gustafsson wrote:
> The other proposal, making sure that we don't see a version 2.x.x creep in (in
> case a packager decides to play cute like how has happened in other OS's) seem
> sane to me, but I'm also not very well versed in Windows so you be the judge
> there.

If that happens to become a problem, I'd rather wait and see when/if
we reach this point.  For the MSIs the PG docs point to, the first
patch is more than enough.
--
Michael




Re: Skipping logical replication transactions on subscriber side

2021-10-19 Thread Greg Nancarrow
On Mon, Oct 18, 2021 at 12:34 PM Masahiko Sawada  wrote:
>
> I've attached updated patches that incorporate all comments I got so far.
>

Minor comment on patch 17-0003

src/backend/replication/logical/worker.c

(1) Typo in apply_handle_stream_abort() comment:

/* Stop skipping transaction transaction, if enabled */
should be:
/* Stop skipping transaction changes, if enabled */


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Postgres perl module namespace

2021-10-19 Thread Michael Paquier
On Tue, Oct 19, 2021 at 10:16:06PM +0200, Erik Rijkers wrote:
>> [0001-move-perl-test-modules-to-PostgreSQL-Test-namespace.patch ]
>> [0002-move-PostgreSQL-Test-PostgresVersion-up-in-the-names.patch]

It seems to me that the hardest part is sorted out with the naming and
pathing of the modules, so better to apply them sooner rather than later.

> Those patches gave some complaints about PostgreSQL/Test/PostgresVersion.pm
> being absent, so I added this deletion.  I'm not sure that's correct, but it
> enabled the build, and check-world ran without errors.

Your change is incorrect, as we want to install PostgresVersion.pm.
What's needed here is the following:
{PostgresVersion.pm => PostgreSQL/Version.pm}

And so the patch needs to be changed like that:
-   $(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/PostgresVersion.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/PostgresVersion.pm'
+   $(INSTALL_DATA) $(srcdir)/PostgreSQL/Version.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Version.pm'
[...]
-   rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/PostgresVersion.pm'
+   rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Version.pm'
--
Michael




RE: Skipping logical replication transactions on subscriber side

2021-10-19 Thread houzj.f...@fujitsu.com
On Mon, Oct 18, 2021 9:34 AM Masahiko Sawada  wrote:
> I've attached updated patches that incorporate all comments I got so far.

Hi,

Here are some minor comments for the patches.

v17-0001-Add-a-subscription-errors-statistics-view-pg_sta.patch

1)

+   /* Clean up */
+   if (not_ready_rels != NIL)
+   list_free_deep(not_ready_rels);

Maybe we don't need the 'if (not_ready_rels != NIL)' check, as
list_free_deep() does this check internally.

2)

+   for (int i = 0; i < msg->m_nentries; i++)
+   {
+       HASH_SEQ_STATUS sstat;
+       PgStat_StatSubWorkerEntry *wentry;
+
+       /* Remove all worker statistics of the subscription */
+       hash_seq_init(&sstat, subWorkerStatHash);
+       while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+       {
+           if (wentry->key.subid == msg->m_subids[i])
+               (void) hash_search(subWorkerStatHash, (void *) &(wentry->key),
+                                  HASH_REMOVE, NULL);

Would it be a little faster to scan the hash table in the outer loop
and the message entries in the inner loop?
Like:
while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
{
    for (int i = 0; i < msg->m_nentries; i++)
    ...


v17-0002-Add-RESET-command-to-ALTER-SUBSCRIPTION-command

1)
I noticed that we cannot RESET slot_name, while we can SET it.
And slot_name has a default behavior: "use the name of the
subscription for the slot name".  So, is it possible to support
RESETting it?

Best regards,
Hou zj



Re: [RFC] building postgres with meson

2021-10-19 Thread Andres Freund
Hi,

On 2021-10-19 18:49:43 -0700, Andres Freund wrote:
> I wonder when it'll be faster to run 32bit ppc via qemu than natively :)

FreeBSD didn't seem to want to boot, but surprisingly a Debian buster image
at least started the installer without problems... It will probably take a
while to see if it actually works.

I assume that to make it acceptable from a build-speed perspective, one
would have to use distcc with the compiler running outside the emulator.

Greetings,

Andres Freund




Re: pg_upgrade test chatter

2021-10-19 Thread Alvaro Herrera
On 2021-Oct-19, Tom Lane wrote:

> Yeah, my original thought had been to hack this at the test level.
> However, I felt like it'd be worth adding this code because we could
> apply it elsewhere in pg_regress.c to save several psql sessions
> (and hence backend starts) per regression DB creation.  That's not a
> huge win, but it'd add up.

Ah, yeah, that sounds like it can be significant under valgrind and
such, so +1.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program
will work forever" (Oliver Silfridge)




Re: pg_upgrade test chatter

2021-10-19 Thread Tom Lane
Alvaro Herrera  writes:
> On 2021-Oct-19, Tom Lane wrote:
>> We could dodge that, with modern versions of psql, by issuing
>> two -c switches.

> Isn't it easier to pass client_min_messages via PGOPTIONS?

> PGOPTIONS="-c client_min_messages=warning" psql -c "drop database if exists 
> foo"

Yeah, my original thought had been to hack this at the test level.
However, I felt like it'd be worth adding this code because we could
apply it elsewhere in pg_regress.c to save several psql sessions
(and hence backend starts) per regression DB creation.  That's not a
huge win, but it'd add up.

regards, tom lane




Re: [RFC] building postgres with meson

2021-10-19 Thread Tom Lane
Andres Freund  writes:
> On 2021-10-19 21:26:53 -0400, Tom Lane wrote:
>> As with the HPPA, a potential compromise is to spin up some newer
>> BSD-ish system on it.  I agree that OSX 10.4 is uninteresting as a
>> software platform, but I'd like to keep 32-bit PPC represented in
>> the farm.

> I assume the reason 32-bit PPC is interesting is that it's commonly run big
> endian?

Aside from bit width and endianness, I believe it's a somewhat smaller
instruction set than the newer CPUs.

> I wonder when it'll be faster to run 32bit ppc via qemu than natively :)

I think qemu would have a ways to go for that.  More to the point,
I've found that its emulation is not as precise as one might wish...

regards, tom lane




Re: pg_upgrade test chatter

2021-10-19 Thread Alvaro Herrera
On 2021-Oct-19, Tom Lane wrote:

> I tried doing this as a one-liner change in pg_regress's
> drop_database_if_exists(), but the idea fell over pretty
> quickly, because what underlies that is a "psql -c" call:
> 
> $ psql -c 'set client_min_messages = warning; drop database if exists foo'
> ERROR:  DROP DATABASE cannot run inside a transaction block
> 
> We could dodge that, with modern versions of psql, by issuing
> two -c switches.

Isn't it easier to pass client_min_messages via PGOPTIONS?

PGOPTIONS="-c client_min_messages=warning" psql -c "drop database if exists foo"


-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/




Re: [RFC] building postgres with meson

2021-10-19 Thread Andres Freund
Hi,

On 2021-10-19 21:26:53 -0400, Tom Lane wrote:
> My notes say
> 
> Currently running OSX 10.4.11 (last release of Tiger); although 10.5 Leopard
> supports PPCs, it refuses to install if CPU speed < 867MHz, well beyond the
> Cube's ability.  Wikipedia does suggest it's possible to run Leopard, but...
> https://en.wikipedia.org/wiki/Mac_OS_X_Leopard#Usage_on_unsupported_hardware
> 
> I'm not sure that I have install media for 10.5 anymore, either --- ISTR
> some machine's CD drive failing and not letting me get the CD back out.
> If I did have it, I don't think there'd be a way to update past 10.5.0
> (surely Apple no longer has those updaters on-line?), so on the whole
> I think that path is a nonstarter.

That does indeed sound like a nonstarter.


> I do have 10.5 running on an old G4 PowerMac, but that machine is (a)
> noisy (b) power-hungry and (c) getting flaky, so I'm uneager to spin up
> a buildfarm animal on it.

Understandable.


> As with the HPPA, a potential compromise is to spin up some newer
> BSD-ish system on it.  I agree that OSX 10.4 is uninteresting as a
> software platform, but I'd like to keep 32-bit PPC represented in
> the farm.

I assume the reason 32-bit PPC is interesting is that it's commonly run big
endian?

I wonder when it'll be faster to run 32bit ppc via qemu than natively :)

Greetings,

Andres Freund




Re: [RFC] building postgres with meson

2021-10-19 Thread Andres Freund
Hi,

On 2021-10-19 17:31:22 -0700, Andres Freund wrote:
> I also need to make meson use our flex wrapper for the relevant versions... I
> can see the warning that'd be fixed by it on macos CI. Will do that and push
> it out to my github repo together with your changes.

That turned out to be more work than I anticipated, so I pushed your changes
out separately.

There's this bit in plflex.pl that talks about adjusting yywrap() for MSVC.
I didn't implement that and didn't see any compilation problems.  It looks
like that originally hails from 2011, in 08a0c2dabc3b9d59d72d7a79ed867b8e37d275a7.

Hm.  It seems not worth carrying forward unless it actually causes trouble?

Greetings,

Andres Freund




Re: [RFC] building postgres with meson

2021-10-19 Thread Tom Lane
Andres Freund  writes:
> On 2021-10-19 15:22:15 -0400, Tom Lane wrote:
>> I'm more concerned about the effort involved in getting meson going on some
>> other old animals, such as prairiedog.

> Yea, that's an *old* OS version. One version too old to have support for
> @rpath, added in 10.5 :(. Is there a reason to run 10.4 specifically?
> According to wikipedia 10.5 is the last version to support ppc.

My notes say

Currently running OSX 10.4.11 (last release of Tiger); although 10.5 Leopard
supports PPCs, it refuses to install if CPU speed < 867MHz, well beyond the
Cube's ability.  Wikipedia does suggest it's possible to run Leopard, but...
https://en.wikipedia.org/wiki/Mac_OS_X_Leopard#Usage_on_unsupported_hardware

I'm not sure that I have install media for 10.5 anymore, either --- ISTR
some machine's CD drive failing and not letting me get the CD back out.
If I did have it, I don't think there'd be a way to update past 10.5.0
(surely Apple no longer has those updaters on-line?), so on the whole
I think that path is a nonstarter.

I do have 10.5 running on an old G4 PowerMac, but that machine is (a)
noisy (b) power-hungry and (c) getting flaky, so I'm uneager to spin up
a buildfarm animal on it.

As with the HPPA, a potential compromise is to spin up some newer
BSD-ish system on it.  I agree that OSX 10.4 is uninteresting as a
software platform, but I'd like to keep 32-bit PPC represented in
the farm.

regards, tom lane




Re: [RFC] building postgres with meson

2021-10-19 Thread Tom Lane
Andres Freund  writes:
> I wish we had a bit more consistency in the flags, so we could centralize
> them. Seems there's no reason to not use -p -p and -b everywhere?

I don't think we care enough about performance of most of the scanners
to make them all backup-free, so -1 to that idea.

We could possibly replace the command line switches with %option
entries in the files themselves.  But I think the reason we haven't
done so for -b is that the Makefile still needs to know about it
so as to know what to do with the lex.backup output file.

regards, tom lane




Re: [RFC] building postgres with meson

2021-10-19 Thread Andres Freund
Hi,

On 2021-10-19 15:22:15 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2021-10-12 01:37:21 -0700, Andres Freund wrote:
> >> As far as I can tell the only OS that postgres currently supports that
> >> meson doesn't support is HPUX. It'd likely be fairly easy to add
> >> gcc-on-hpux support, a chunk more to add support for the proprietary
> >> ones.
>
> > Tom, wrt HPUX on pa-risc, what are your thoughts there? IIRC we gave up
> > supporting HP's compiler on pa-risc a while ago.
>
> Right.  I am still testing with gcc on HP-PA.  I'd kind of like to
> keep it running just as an edge case for our spinlock support, but
> I'm not sure that I want to do any huge amount of work on meson
> to keep that going.

Makes sense.  While that does test an odd special case for our spinlock
implementation, it's also the only supported platform with that edge case, and
it seems extremely unlikely that there ever will be a new platform with such
odd/limited atomic operations.


> I do have a functioning OpenBSD installation on that machine, so
> one alternative if the porting costs look too high is to replace
> gaur with an OpenBSD animal. However, last I checked, OpenBSD
> was about half the speed of HPUX on that hardware, so I'm not
> real eager to go that way.  gaur's already about the slowest
> animal in the farm :-(

Yea, that doesn't sound enticing.  Seems like we should either keep it
running on HP-UX or just drop PA-RISC support?


> > As I said, it'd probably not be too hard to add meson support for hpux
> > on hppa, it's probably just a few branches.  But that'd require access
> > somewhere.  The gcc compile farm does not have a hppa member anymore...
>
> If you've got an idea where to look, I could add that to my to-do queue.

It might even just work. Looks like meson does have pa-risc detection. While
it doesn't have any specifically for hpux, it just falls back to python's
sys.platform in that case. python3 -c 'import sys;print(sys.platform)'

meson generates output for ninja to execute (basically a faster make,
partly faster by being much less flexible; it is intended to be generated
by more user-friendly build systems).  Ninja can be built by a minimal
python script, or with cmake.  The former doesn't seem to have hpux
support, but the latter does, I think.
https://github.com/ninja-build/ninja

So it could be interesting to see if ninja builds.


I've not yet taught the PG meson build the necessary stuff for a 32-bit
build, so there's not much point in trying whether meson works there yet.
I'll try to do that, and let you know.


> I'm more concerned about the effort involved in getting meson going on some
> other old animals, such as prairiedog.

Yea, that's an *old* OS version. One version too old to have support for
@rpath, added in 10.5 :(. Is there a reason to run 10.4 specifically?
According to wikipedia 10.5 is the last version to support ppc.

Looks like python still supports building back to 10.4.

Greetings,

Andres Freund




Re: [RFC] building postgres with meson

2021-10-19 Thread Andres Freund
Hi,

On 2021-10-19 17:57:31 -0400, John Naylor wrote:
> I know this is still in the evaluation stage, but I did notice some
> discrepancies in the Flex flags.  With the attached patch, the read-only
> data segment seems to match up pretty well now.

Good catch. I think I just copied them around...

I wish we had a bit more consistency in the flags, so we could centralize
them. Seems there's no reason to not use -p -p and -b everywhere?


I also need to make meson use our flex wrapper for the relevant versions... I
can see the warning that'd be fixed by it on macos CI. Will do that and push
it out to my github repo together with your changes.

Thanks!

Andres




Re: Parallel vacuum workers prevent the oldest xmin from advancing

2021-10-19 Thread Masahiko Sawada
On Wed, Oct 20, 2021 at 3:07 AM Alvaro Herrera  wrote:
>
> On 2021-Oct-19, Alvaro Herrera wrote:
>

Thank you for the comment.

> > Hmm, I think this should happen before the transaction snapshot is
> > established in the worker; perhaps immediately after calling
> > StartParallelWorkerTransaction(), or anyway not after
> > SetTransactionSnapshot.  In fact, since SetTransactionSnapshot receives
> > a 'sourceproc' argument, why not do it exactly there? ISTM that
> > ProcArrayInstallRestoredXmin() is where this should happen.
>
> ... and there is a question about the lock strength used for
> ProcArrayLock.  The current routine uses LW_SHARED, but there's no
> clarity that we can modify proc->statusFlags and ProcGlobal->statusFlags
> without LW_EXCLUSIVE.
>
> Maybe we can change ProcArrayInstallRestoredXmin so that if it sees that
> proc->statusFlags is not zero, then it grabs LW_EXCLUSIVE (and copies),
> otherwise it keeps using LW_SHARED as it does now (and does not copy.)

Initially, I considered copying statusFlags in
ProcArrayInstallRestoredXmin(), but I hesitated to do that because
statusFlags is not related to xmin and snapshot handling.  Still, I
agree that copying statusFlags should happen before restoring the
snapshot.

If we copy statusFlags in ProcArrayInstallRestoredXmin(), is there
still a small window in which the restored snapshot holds back the
oldest xmin?  If so, it would be better to call
ProcArrayCopyStatusFlags() right after StartParallelWorker().

> (This also suggests that using LW_EXCLUSIVE unconditionally for all
> cases, as your patch does, is not great.  OTOH it's just once at every
> bgworker start, so it's not *that* frequent.)

Agreed.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: ThisTimeLineID can be used uninitialized

2021-10-19 Thread Alvaro Herrera
On 2021-Oct-19, Andres Freund wrote:

> Hi,
> 
> On 2021-10-19 15:13:04 -0400, Robert Haas wrote:
> > This is a followup to
> > http://postgr.es/m/ca+tgmoz5a26c6oxkapafyuy_sx0vg6vxdd_q6asezsvrphd...@mail.gmail.com.
> > I'm suspicious of the following code in CreateReplicationSlot:
> > 
> > /* setup state for WalSndSegmentOpen */
> > sendTimeLineIsHistoric = false;
> > sendTimeLine = ThisTimeLineID;
> > 
> > The first thing that's odd about this is that if this is physical
> > replication, it's apparently dead code, because AFAICT sendTimeLine
> > will not be used for anything in that case.
> 
> It's quite confusing. It's *really* not helped by physical replication using
> but not really using an xlogreader to keep state. Which presumably isn't
> actually used during a physical CreateReplicationSlot(), but is referenced by
> a comment :/

Yeah, that's not very nice.  My preference would be to change physical
replication to use xlogreader in the regular way, and avoid confounding
backdoors like its current approach.

> > But it sure seems strange that the code would apparently work just
> > as well as it does today with the following patch:
> > 
> > diff --git a/src/backend/replication/walsender.c
> > b/src/backend/replication/walsender.c
> > index b811a5c0ef..44fd598519 100644
> > --- a/src/backend/replication/walsender.c
> > +++ b/src/backend/replication/walsender.c
> > @@ -945,7 +945,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
> > 
> >  /* setup state for WalSndSegmentOpen */
> >  sendTimeLineIsHistoric = false;
> > -sendTimeLine = ThisTimeLineID;
> > +sendTimeLine = rand() % 10;

Hah.  Yeah, when you can do things like that and the tests don't break,
that indicates a problem in the tests.

> Istm we should introduce an InvalidTimeLineID, and explicitly initialize
> sendTimeLine to that, and assert that it's valid / invalid in a bunch of
> places?

That's not a bad idea; it'll help discover bogus code.  Obviously, some
additional tests wouldn't hurt -- we have a lot more coverage now than
in the embarrassingly recent past, but it can still be improved.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Women are like slings: the more resistance they offer,
 the farther you can get with them"  (Jonas Nightingale, Leap of Faith)




Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)

2021-10-19 Thread Stephen Frost
Greetings,

On Tue, Oct 19, 2021 at 18:26 Mark Dilger 
wrote:

>
>
> > On Oct 19, 2021, at 3:18 PM, Jeff Davis  wrote:
> >
> > On Tue, 2021-10-19 at 13:17 -0700, Mark Dilger wrote:
> >> Wouldn't it be much cleaner to have superuser bypass the trigger?
> >
> > Maybe it could be a user property like "BYPASS_EVENT_TRIGGERS", and
> > only superusers could adjust it (like the SUPERUSER and REPLICATION
> > properties).
> >
> > I suppose it would default to BYPASS_EVENT_TRIGGERS for superusers and
> > not for non-superusers. A little awkward to have different defaults,
> > but it seems sensible in this case.
> >
> > Would this bypass all event triggers, or only the event triggers of
> > another user?
>
> The difficulty is that non-superuser owned event triggers could be
> something of a minefield for scripts run as superuser.  The cleanest way
> around that would be to have them never fire in response to superuser
> actions.  Installations could still have event triggers that cover all
> users, including superusers, as long as they have those triggers owned by
> superuser.
>
> The implementation in the patch set does this, but with finer grained
> precision, because the universe of roles is divided into more than just
> superuser vs. non-superuser.


This last point is particularly important. Non-superusers may be able to
become superuser, and the roles that can do so also need to be protected.
Protecting only the superuser roles themselves is *not* enough.

Thanks,

Stephen



Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)

2021-10-19 Thread Mark Dilger



> On Oct 19, 2021, at 3:18 PM, Jeff Davis  wrote:
> 
> On Tue, 2021-10-19 at 13:17 -0700, Mark Dilger wrote:
>> Wouldn't it be much cleaner to have superuser bypass the trigger?
> 
> Maybe it could be a user property like "BYPASS_EVENT_TRIGGERS", and
> only superusers could adjust it (like the SUPERUSER and REPLICATION
> properties).
> 
> I suppose it would default to BYPASS_EVENT_TRIGGERS for superusers and
> not for non-superusers. A little awkward to have different defaults,
> but it seems sensible in this case.
> 
> Would this bypass all event triggers, or only the event triggers of
> another user?

The difficulty is that non-superuser owned event triggers could be something of 
a minefield for scripts run as superuser.  The cleanest way around that would 
be to have them never fire in response to superuser actions.  Installations 
could still have event triggers that cover all users, including superusers, as 
long as they have those triggers owned by superuser.

The implementation in the patch set does this, but with finer grained 
precision, because the universe of roles is divided into more than just 
superuser vs. non-superuser.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: ALTER INDEX .. RENAME allows to rename tables/views as well

2021-10-19 Thread Bossart, Nathan
On 10/19/21, 3:13 PM, "Alvaro Herrera"  wrote:
> On 2021-Oct-19, Bossart, Nathan wrote:
>
>> I did consider this, but I figured it might be better to keep the lock
>> level consistent for a given object type no matter what the statement
>> type is.  I don't have a strong opinion about this, though.
>
> Yeah, the problem is that if there is a concurrent process waiting on
> your lock, we'll release ours and they'll grab theirs, so we'll be
> waiting on them afterwards, which is worse.

Makes sense.

> BTW I noticed that the case of partitioned indexes was wrong too.  I
> fixed that, added it to the tests, and pushed.

Ah, good catch.  Thanks!

Nathan



Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)

2021-10-19 Thread Jeff Davis
On Tue, 2021-10-19 at 13:17 -0700, Mark Dilger wrote:
> Wouldn't it be much cleaner to have superuser bypass the trigger?

Maybe it could be a user property like "BYPASS_EVENT_TRIGGERS", and
only superusers could adjust it (like the SUPERUSER and REPLICATION
properties).

I suppose it would default to BYPASS_EVENT_TRIGGERS for superusers and
not for non-superusers. A little awkward to have different defaults,
but it seems sensible in this case.

Would this bypass all event triggers, or only the event triggers of
another user?

Regards,
Jeff Davis






Re: ALTER INDEX .. RENAME allows to rename tables/views as well

2021-10-19 Thread Alvaro Herrera
On 2021-Oct-19, Bossart, Nathan wrote:

> I did consider this, but I figured it might be better to keep the lock
> level consistent for a given object type no matter what the statement
> type is.  I don't have a strong opinion about this, though.

Yeah, the problem is that if there is a concurrent process waiting on
your lock, we'll release ours and they'll grab theirs, so we'll be
waiting on them afterwards, which is worse.

BTW I noticed that the case of partitioned indexes was wrong too.  I
fixed that, added it to the tests, and pushed.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"People get annoyed when you try to debug them."  (Larry Wall)




Re: [RFC] building postgres with meson

2021-10-19 Thread John Naylor
Hi,

I know this is still in the evaluation stage, but I did notice some
discrepancies in the Flex flags. With the attached patch, the read-only
data segment seems to match up pretty well now.

-- 
John Naylor
EDB: http://www.enterprisedb.com


sync-flex-flags-with-autoconf-build.patch
Description: Binary data


Re: speed up verifying UTF-8

2021-10-19 Thread John Naylor
I've decided I'm not quite comfortable with the additional complexity in
the build system introduced by the SIMD portion of the previous patches. It
would make more sense if the pure C portion were unchanged, but with the
shift-based DFA plus the bitwise ASCII check, we have a portable
implementation that's still a substantial improvement over the current
validator. In v24, I've included only that much, and the diff is only about
1/3 as many lines. If future improvements to COPY FROM put additional
pressure on this path, we can always add SIMD support later.
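For concreteness, the bitwise ASCII check mentioned above works by testing the high bit of eight input bytes at once; this is a standalone sketch with hypothetical helper names, not the patch's actual functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return true if the next 8 bytes are all ASCII: load them as one
 * 64-bit word (memcpy avoids alignment hazards) and test the high
 * bit of every byte with a single AND. */
static bool
chunk_is_ascii(const unsigned char *s)
{
	uint64_t	chunk;

	memcpy(&chunk, s, sizeof(chunk));
	return (chunk & UINT64_C(0x8080808080808080)) == 0;
}

/* Length of the leading all-ASCII prefix: 8 bytes at a time on the
 * fast path, then byte-by-byte for the tail. */
static size_t
ascii_prefix_len(const unsigned char *s, size_t len)
{
	size_t		i = 0;

	while (i + 8 <= len && chunk_is_ascii(s + i))
		i += 8;
	while (i < len && s[i] < 0x80)
		i++;
	return i;
}
```

Only the bytes past the ASCII prefix then need to be fed through the shift-based DFA, which is why mostly-ASCII input validates so much faster.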

One thing not in this patch is a possible improvement to
pg_utf8_verifychar() that Heikki and I worked on upthread as part of
earlier attempts to rewrite pg_utf8_verifystr(). That's worth looking into
separately.

On Thu, Aug 26, 2021 at 12:09 PM Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:
>
> >Attached is v23 incorporating the 32-bit transition table, with the
necessary comment adjustments
>
> 32bit table is nice.

Thanks for taking a look!

> Would you please replace
https://github.com/BobSteagall/utf_utils/blob/master/src/utf_utils.cpp URL
with
>
https://github.com/BobSteagall/utf_utils/blob/6b7a465265de2f5fa6133d653df0c9bdd73bbcf8/src/utf_utils.cpp
> in the header of src/port/pg_utf8_fallback.c?
>
> It would make the URL more stable in case the file gets renamed.
>
> Vladimir
>

Makes sense, so done that way.

--
John Naylor
EDB: http://www.enterprisedb.com


v24-0001-Add-fast-path-for-validating-UTF-8-text.patch
Description: Binary data


Re: CREATE ROLE IF NOT EXISTS

2021-10-19 Thread Isaac Morland
On Tue, 19 Oct 2021 at 16:12, David Christensen <
david.christen...@crunchydata.com> wrote:

> Greetings -hackers,
>
> Enclosed is a patch that implements CREATE ROLE IF NOT EXISTS (along with
> the same support for USER/GROUP).  This is a fairly straightforward
> approach in that we do no validation of anything other than existence, with
> the user needing to ensure that permissions/grants are set up in the proper
> way.
>

One little tricky aspect that occurs to me is the ALTER ROLE to set the
role flag options: it really needs to mention *all* the available options
if it is to leave the role in a specific state regardless of how it started
out. For example, if the existing role has BYPASSRLS but you want the
default NOBYPASSRLS you have to say so explicitly.

Because of this, I think my preference, based just on thinking about
setting the flag options, would be for CREATE OR REPLACE.

However, I'm wondering about the role name options: IN ROLE, ROLE, ADMIN.
With OR REPLACE should they replace the set of memberships or augment it?
Either seems potentially problematic to me. By contrast it’s absolutely
clear what IF NOT EXISTS should do with these.

So I’m not sure what I think overall.


Re: Refactoring pg_dump's getTables()

2021-10-19 Thread Tom Lane
Daniel Gustafsson  writes:
> On 17 Oct 2021, at 22:05, Alvaro Herrera  wrote:
>> Maybe I would group together the changes that all require the same version
>> test, rather than keeping the output columns in the same order.

> I agree with that, if we're doing all this we might as well go all the way for
> readability.

I had a go at doing that, but soon decided that it wasn't as great an
idea as it first seemed.  There are two problems:

1. It's not clear what to do with fields where there are three or more
variants, such as reloptions and checkoptions.

2. Any time we modify the behavior for a particular field, we'd
have to merge or un-merge it from the stanza for the
previously-most-recently-relevant version.  This seems like it'd
be a maintenance headache and make patch footprints bigger than they
needed to be.

So I ended up not doing very much merging.  I did make an effort
to group the fields in perhaps a slightly more logical order
than before.

regards, tom lane




Re: Extending amcheck to check toast size and compression

2021-10-19 Thread Greg Stark
Right so here's a review.

I think the patch is committable as is. It's an improvement and it
does the job as promised. I do have some comments but I don't think
they're serious issues and would actually be pretty happy committing
it as is. Fwiw I didn't realize how short the patch was at first and
it probably doesn't need yet another review.


/* Toasted attributes too large to be untoasted should never be stored */
if (toast_pointer.va_rawsize > VARLENA_SIZE_LIMIT)

1) I know this used to say MaxAlloc -- personally I would probably go
with that but either is fine. But if you're going to have a separate
constant there could be more of a comment explaining why that's the
maximum -- probably with a pointer to MaxAlloc and postgres.h's
VARSIZE macros.
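As a rough restatement of the bound being checked (a hedged standalone sketch: `VARLENA_SIZE_LIMIT` and `va_rawsize` are the names used by the patch, but `rawsize_exceeds_limit` is a hypothetical helper, not amcheck code):

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors MaxAllocSize from src/include/utils/memutils.h: 1 GB - 1.
 * No palloc'd datum can exceed this, so no valid toast pointer can
 * claim a larger uncompressed size. */
#define VARLENA_SIZE_LIMIT 0x3FFFFFFF

/* Hypothetical helper restating the amcheck test discussed above. */
static int
rawsize_exceeds_limit(int32_t va_rawsize)
{
	return va_rawsize > VARLENA_SIZE_LIMIT;
}
```

Any toast pointer whose rawsize fails this test is corrupt regardless of what the chunks actually contain, which is why the check is so cheap.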

The switch statement at line 1443 seems a bit ... baroque. Is it
clearer than a simple "if (cmid != TOAST_PGLZ_COMPRESSION_ID && cmid !=
TOAST_LZ4_COMPRESSION_ID)"? I mean, I see this is easier to add more
cases to, but I found that dealing with a case that falls through and no
default is a lot of cognitive overhead for understanding what is in the
end just effectively a simple branch.

Fwiw compilers aren't always the best at optimizing switch statements.
It's entirely possible a compiler may end up building a whole lookup
table of jumps for this thing. Not that it's performance critical but
...

But all that's more words than necessary for a minor style comment.


Fwiw I spent a few minutes thinking about and writing up this
suggestion and then only afterwards realized the code in question
wasn't from this patch. I'll mention it anyways but it's not relevant
to this patch review I guess :)

I found the whole expected_chunk_seq parameter thing a bit confusing
and less useful than it could be. I would instead suggest:

Allocate an array of the expected number of chunk numbers before
calling check_toast_tuple and then just gather the chunk_seq that are
returned. When it's finished you can do things like: a) Check if
they're all ascending and report index corruption if not. b) Check if
any numbers are missing and report which ones are missing and/or how
many. c) Check if there are duplicates and report that. These would
all be easier for a user to interpret than "index scan returned chunk
5 when expecting chunk 9".
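The gather-then-classify idea could look roughly like this (a sketch under assumed names; `ChunkCheck` and `check_chunk_seqs` are hypothetical, not amcheck code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct ChunkCheck
{
	bool		out_of_order;	/* sequence not strictly ascending */
	bool		has_duplicate;	/* some chunk_seq appeared twice */
	int			missing;		/* how many of 0..expected-1 never appeared */
} ChunkCheck;

/* Classify the problems in a gathered list of toast chunk_seq values,
 * instead of reporting them one mismatch at a time. */
static ChunkCheck
check_chunk_seqs(const int *seqs, int n, int expected)
{
	ChunkCheck	res = {false, false, 0};
	bool	   *seen = calloc(expected, sizeof(bool));

	for (int i = 0; i < n; i++)
	{
		int			s = seqs[i];

		if (i > 0 && s <= seqs[i - 1])
			res.out_of_order = true;
		if (s >= 0 && s < expected)
		{
			if (seen[s])
				res.has_duplicate = true;
			seen[s] = true;
		}
	}
	for (int s = 0; s < expected; s++)
		if (!seen[s])
			res.missing++;
	free(seen);
	return res;
}
```

A caller could then emit one message per category ("chunks 2 and 5 missing", "chunk 1 duplicated"), which is easier to interpret than a per-chunk mismatch report.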



On Tue, 4 May 2021 at 12:20, Mark Dilger  wrote:
>
> Hackers,
>
> During the version 14 development period, a few checks of toasted attributes 
> were written but never committed.  For the version 15 development cycle, I'd 
> like to consider extending the checks of toasted attributes.  First, no 
> toasted attribute should ever have a rawsize larger than the 1GB varlena 
> limit.  Second, no compressed toasted attribute should have an extsize 
> indicating that the toast expanded during toasting.  Such a extsize could 
> mean the compression code is malfunctioning, or that the extsize or rawsize 
> fields are corrupt.  Third, any compressed attribute should have a valid 
> compression method ID.
>
> These checks are cheap.  Actually retrieving the compressed toasted data and 
> checking that it uncompresses correctly would have very different performance 
> implications, but that is not included in this patch.
>
>
>
> —
> Mark Dilger
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>


-- 
greg

On Wed, 14 Jul 2021 at 10:58, Mark Dilger  wrote:
>
>
>
> > On Jul 14, 2021, at 3:33 AM, Heikki Linnakangas  wrote:
> >
> >> +/* The largest valid toast va_rawsize */
> >> +#define VARLENA_SIZE_LIMIT 0x3FFFFFFF
> >> +
> >
> > Hmm, a toasted datum cannot be larger than MaxAllocSize, because it's 
> > reconstituted in a palloc'd datum, right?
>
> No datum size exceeds MaxAllocSize, and no datum expands when compressed 
> (because for those that do expand under any particular compression algorithm, 
> we opt to instead store the datum uncompressed), so no valid toast pointer 
> should contain a va_rawsize field greater than MaxAllocSize.  Any toast 
> pointers that have larger va_rawsize fields are therefore corrupt.
>
> VARLENA_SIZE_LIMIT is defined here equal to MaxAllocSize:
>
> src/include/utils/memutils.h:#define MaxAllocSize   ((Size) 0x3fffffff) /* 1 gigabyte - 1 */
>
> Earlier versions of the patch used MaxAllocSize rather than defining 
> VARLENA_SIZE_LIMIT, but review comments suggested that was less clear.
>
> —
> Mark Dilger
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>
>
>


-- 
greg




Re: ALTER INDEX .. RENAME allows to rename tables/views as well

2021-10-19 Thread Bossart, Nathan
On 10/19/21, 1:36 PM, "Alvaro Herrera"  wrote:
> I was about to push this when it occurred to me that it seems a bit
> pointless to release AEL in order to retry with the lighter lock; once
> we have AEL, let's just keep it and proceed.  So how about the attached?

I did consider this, but I figured it might be better to keep the lock
level consistent for a given object type no matter what the statement
type is.  I don't have a strong opinion about this, though.

> I'm now thinking that this is to back-patch all the way to 12.

+1.  The patch LGTM.  I like the test additions to check the lock
level.

Nathan



Re: ThisTimeLineID can be used uninitialized

2021-10-19 Thread Andres Freund
Hi,

On 2021-10-19 15:13:04 -0400, Robert Haas wrote:
> This is a followup to
> http://postgr.es/m/ca+tgmoz5a26c6oxkapafyuy_sx0vg6vxdd_q6asezsvrphd...@mail.gmail.com.
> I'm suspicious of the following code in CreateReplicationSlot:
> 
> /* setup state for WalSndSegmentOpen */
> sendTimeLineIsHistoric = false;
> sendTimeLine = ThisTimeLineID;
> 
> The first thing that's odd about this is that if this is physical
> replication, it's apparently dead code, because AFAICT sendTimeLine
> will not be used for anything in that case.

It's quite confusing. It's *really* not helped by physical replication using
but not really using an xlogreader to keep state. Which presumably isn't
actually used during a physical CreateReplicationSlot(), but is referenced by
a comment :/


> But I don't know if it matters. We call CreateInitDecodingContext()
> with sendTimeLine and ThisTimeLineID still zero; it doesn't call any
> callbacks. Then we call DecodingContextFindStartpoint() with
> sendTimeLine still 0 and the first callback that gets invoked is
> logical_read_xlog_page. At this point sendTimeLine = 0 and
> ThisTimeLineID = 0. That calls XLogReadDetermineTimeline() which
> resets ThisTimeLineID to the correct value of 2, but when we get back
> to logical_read_xlog_page, we still manage to call WALRead with a
> timeline of 0 because state->seg.ws_tli is still 0. And when WALRead
> eventually does call WalSndOpen, which unconditionally propagates
> sendTimeLine into the TLI pointer that is passed to it. So now
> state->seg_ws_tli also ends up being 2. So I guess maybe nothing bad
> happens? But it sure seems strange that the code would apparently work
> just as well as it does today with the following patch:
> 
> diff --git a/src/backend/replication/walsender.c
> b/src/backend/replication/walsender.c
> index b811a5c0ef..44fd598519 100644
> --- a/src/backend/replication/walsender.c
> +++ b/src/backend/replication/walsender.c
> @@ -945,7 +945,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
> 
>  /* setup state for WalSndSegmentOpen */
>  sendTimeLineIsHistoric = false;
> -sendTimeLine = ThisTimeLineID;
> +sendTimeLine = rand() % 10;
> 
>  if (cmd->kind == REPLICATION_KIND_PHYSICAL)
>  {

Istm we should introduce an InvalidTimeLineID, and explicitly initialize
sendTimeLine to that, and assert that it's valid / invalid in a bunch of
places?
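A minimal sketch of what that could look like (`InvalidTimeLineID` does not exist in the tree; the sentinel value and the helper are assumptions for illustration only):

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical sentinel: timeline IDs start at 1, so 0 is free. */
#define InvalidTimeLineID ((TimeLineID) 0)

static TimeLineID sendTimeLine = InvalidTimeLineID;

/* Any code path that starts streaming must have established a real
 * timeline first; the assertion catches uninitialized use early
 * instead of silently reading WAL from timeline 0. */
static void
StartSendingFromTimeline(TimeLineID tli)
{
	assert(tli != InvalidTimeLineID);
	sendTimeLine = tli;
}
```

With that in place, the `rand() % 10` experiment above would trip an assertion rather than quietly appearing to work.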

Greetings,

Andres Freund




Re: ALTER INDEX .. RENAME allows to rename tables/views as well

2021-10-19 Thread Alvaro Herrera
I was about to push this when it occurred to me that it seems a bit
pointless to release AEL in order to retry with the lighter lock; once
we have AEL, let's just keep it and proceed.  So how about the attached?

I'm now thinking that this is to back-patch all the way to 12.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
>From 04334bc719efe69f63c8a590b9c682a22c6fb413 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 19 Oct 2021 17:02:17 -0300
Subject: [PATCH v3] Ensure correct lock level is used in ALTER ... RENAME

Commit 1b5d797cd4f7 intended to relax the lock level used to rename
indexes, but inadvertently allowed *any* relation to be renamed with a
lowered lock level, as long as the command is spelled ALTER INDEX.
That's undesirable for other relation types, so retry the operation with
the higher lock if the relation turns out not to be an index.

After this fix, ALTER INDEX  RENAME will require access
exclusive lock.

Author: Nathan Bossart 
Reported-by: Onder Kalaci 
Discussion: https://postgr.es/m/ph0pr21mb1328189e2821cdec646f8178d8...@ph0pr21mb1328.namprd21.prod.outlook.com
---
 src/backend/commands/tablecmds.c  | 60 +--
 src/test/regress/expected/alter_table.out | 39 +++
 src/test/regress/sql/alter_table.sql  | 25 ++
 3 files changed, 110 insertions(+), 14 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 487852a14e..f02399f699 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3735,7 +3735,7 @@ RenameConstraint(RenameStmt *stmt)
 ObjectAddress
 RenameRelation(RenameStmt *stmt)
 {
-	bool		is_index = stmt->renameType == OBJECT_INDEX;
+	bool		is_index_stmt = stmt->renameType == OBJECT_INDEX;
 	Oid			relid;
 	ObjectAddress address;
 
@@ -3745,24 +3745,45 @@ RenameRelation(RenameStmt *stmt)
 	 * end of transaction.
 	 *
 	 * Lock level used here should match RenameRelationInternal, to avoid lock
-	 * escalation.
+	 * escalation.  However, because ALTER INDEX can be used with any relation
+	 * type, we mustn't believe without verification.
 	 */
-	relid = RangeVarGetRelidExtended(stmt->relation,
-	 is_index ? ShareUpdateExclusiveLock : AccessExclusiveLock,
-	 stmt->missing_ok ? RVR_MISSING_OK : 0,
-	 RangeVarCallbackForAlterRelation,
-	 (void *) stmt);
-
-	if (!OidIsValid(relid))
+	for (;;)
 	{
-		ereport(NOTICE,
-(errmsg("relation \"%s\" does not exist, skipping",
-		stmt->relation->relname)));
-		return InvalidObjectAddress;
+		LOCKMODE	lockmode;
+		bool		obj_is_index;
+
+		lockmode = is_index_stmt ? ShareUpdateExclusiveLock : AccessExclusiveLock;
+
+		relid = RangeVarGetRelidExtended(stmt->relation, lockmode,
+		 stmt->missing_ok ? RVR_MISSING_OK : 0,
+		 RangeVarCallbackForAlterRelation,
+		 (void *) stmt);
+
+		if (!OidIsValid(relid))
+		{
+			ereport(NOTICE,
+	(errmsg("relation \"%s\" does not exist, skipping",
+			stmt->relation->relname)));
+			return InvalidObjectAddress;
+		}
+
+		/*
+		 * We allow mismatched statement and object types (e.g., ALTER INDEX
+		 * to rename a table), but we might've used the wrong lock level.  If
+		 * that happens, retry with the correct lock level.  We don't bother
+		 * if we already acquired AccessExclusiveLock with an index, however.
+		 */
+		obj_is_index = (get_rel_relkind(relid) == RELKIND_INDEX);
+		if (obj_is_index || is_index_stmt == obj_is_index)
+			break;
+
+		UnlockRelationOid(relid, lockmode);
+		is_index_stmt = obj_is_index;
 	}
 
 	/* Do the work */
-	RenameRelationInternal(relid, stmt->newname, false, is_index);
+	RenameRelationInternal(relid, stmt->newname, false, is_index_stmt);
 
 	ObjectAddressSet(address, RelationRelationId, relid);
 
@@ -3810,6 +3831,17 @@ RenameRelationInternal(Oid myrelid, const char *newrelname, bool is_internal, bo
  errmsg("relation \"%s\" already exists",
 		newrelname)));
 
+	/*
+	 * RenameRelation is careful not to believe the caller's idea of the
+	 * relation kind being handled.  We don't have to worry about this (at
+	 * least not until ALTER TABLE .. ADD CONSTRAINT ... USING INDEX is
+	 * implemented for partitioned tables), but let's not be totally oblivious
+	 * to it.  We can process an index as not-an-index, but not the other way
+	 * around.
+	 */
+	Assert(!is_index ||
+		   is_index == (targetrelation->rd_rel->relkind == RELKIND_INDEX));
+
 	/*
 	 * Update pg_class tuple with new relname.  (Scribbling on reltup is OK
 	 * because it's a copy...)
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index fa54bef89a..a3a18b23bf 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -232,6 +232,45 @@ SET ROLE regress_alter_table_user1;
 ALTER INDEX onek_unique1 RENAME TO fail;  -- permission denied
 ERROR:  must be owner of index onek_unique1
 RESET 

Re: pg_upgrade test chatter

2021-10-19 Thread Bossart, Nathan
On 10/19/21, 12:37 PM, "Tom Lane"  wrote:
> Actually ... why shouldn't we suppress that by running the command
> with client_min_messages = warning?  This would have to be a change
> to pg_regress, but I'm having a hard time thinking of cases where
> quieting that message would be a problem.

I was just looking into something like this.

> We could dodge that, with modern versions of psql, by issuing
> two -c switches.  So after a bit of hacking I have the attached
> POC patch.  It's incomplete because now that we have this
> infrastructure we should change other parts of pg_regress
> to not launch psql N times where one would do.  But it's enough
> to get through check-world without any chatter.
> 
> Any objections to polishing this up and pushing it?

No objections here.  This seems like an overall improvement, and I
confirmed that it clears up the NOTICE from the pg_upgrade test.

Nathan



Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)

2021-10-19 Thread Mark Dilger



> On Oct 19, 2021, at 12:28 PM, Jeff Davis  wrote:
> 
> On Mon, 2021-09-27 at 11:15 -0700, Mark Dilger wrote:
>> Allow non-superusers to create event triggers.  The logic already
>> existed and is unchanged to handle event triggers owned by non-
>> superusers and conditioning those triggers firing on the (trigger-
>> owner, role-performing-event) pair.  The v7 patch set didn't go quite
>> so far as allowing non-superusers to create event triggers, but that
>> undercuts much of the benefit of the changes for no obvious purpose.
> 
> The thread on role self-administration seems like a dependency here.
> And it doesn't look like there's consensus that we should be
> conditioning event trigger firing on role membership:
> 
> https://postgr.es/m/20211005043438.gb314...@rfd.leadboat.com

I have noticed the lack of consensus.  The resistance to having roles own other 
roles should get more attention, I think.

Stephen and I went into the weeds on what DROP ROLE rolename CASCADE should 
mean, but I don't think that should hold up the idea of role ownership.  He 
wanted a different command to do the work rather than this command, but I don't 
see anything in what he wrote to suggest that the idea is unacceptable, only a 
different preference on how that functionality gets spelled.

There was also some difference in interpretation on what exact differences 
there are between "ownership" and "dependency".  To me, "ownership" is a 
subtype of dependency, just as "is indexed by" and "is contained in" are 
subtypes of dependency.  Indexes are dependent on the tables they index, tables 
are dependent on schemas that contain them, objects are dependent on roles that 
own them, and so forth.  Stephen seemed to have a different view.  I'm not at 
all clear on whether his different view is a showstopper.

Before redesigning the way we fix up event triggers for v15, I'd like to have a 
sense of how contentious all this is.  If it's just a matter of definitions and 
command spellings, we can work around it.

Thanks for participating in this thread, BTW.

> Instead, how about:
> 
> * make a predefined role pg_event_trigger that allows creating event
> triggers
> * make it an error for a superuser to fire an event trigger created by
> a non-superuser

I think blocking superuser actions is a non-starter, but you address that 
below

> It doesn't solve the problem hierarchically, but we don't solve other
> predefined role privileges hierarchically, either (and for many of them
> it makes no sense).
> 
> A downside is that the privileged event trigger creator could
> accidentally make life annoying for a superuser that's trying to issue
> DDL: the superuser would need to disable the event trigger, perform the
> action, then re-enable it. But that shouldn't be a practical problem in
> sane setups -- superusers shouldn't be performing a lot of DDL, and if
> they are, it's good to be explicit that they are bypassing something
> configured by their pseudo-admin.

I'd prefer not to assume much about the sanity of the setup, and I agree the 
superuser should be able to unconditionally disable the offending event 
trigger, but I think it is a pretty poor solution that a superuser would need 
to disable and then re-enable a trigger.  Other commands in other sessions 
would be able to sneak through during the window of time when the trigger is 
disabled.  Wouldn't it be much cleaner to have superuser bypass the trigger?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Postgres perl module namespace

2021-10-19 Thread Erik Rijkers

On 19-10-2021 at 20:54, Andrew Dunstan wrote:




Discussion has gone quiet and the tree is now relatively quiet, so now
seems like a good time to do this. See attached patches.



> [0001-move-perl-test-modules-to-PostgreSQL-Test-namespace.patch ]
> [0002-move-PostgreSQL-Test-PostgresVersion-up-in-the-names.patch]


Those patches gave some complaints about 
PostgreSQL/Test/PostgresVersion.pm being absent, so I added this 
deletion.  I'm not sure that's correct, but it enabled the build and 
check-world ran without errors.



Erik Rijkers
--- src/test/perl/Makefile.orig2	2021-10-19 21:40:38.388116778 +0200
+++ src/test/perl/Makefile	2021-10-19 21:40:52.208346619 +0200
@@ -23,13 +23,11 @@
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/SimpleTee.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/SimpleTee.pm'
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/RecursiveCopy.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/RecursiveCopy.pm'
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/Cluster.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Cluster.pm'
-	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/PostgresVersion.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/PostgresVersion.pm'
 
 uninstall:
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Utils.pm'
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/SimpleTee.pm'
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/RecursiveCopy.pm'
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Cluster.pm'
-	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/PostgresVersion.pm'
 
 endif


CREATE ROLE IF NOT EXISTS

2021-10-19 Thread David Christensen
Greetings -hackers,

Enclosed is a patch that implements CREATE ROLE IF NOT EXISTS (along with
the same support for USER/GROUP).  This is a fairly straightforward
approach in that we do no validation of anything other than existence, with
the user needing to ensure that permissions/grants are set up in the proper
way.

Comments?

Best,

David


CREATE-ROLE-IF-NOT-EXISTS.patch
Description: Binary data


Re: pg_upgrade test chatter

2021-10-19 Thread Tom Lane
I wrote:
> "Bossart, Nathan"  writes:
>> I run 'make check-world' a lot, and I typically use parallelism and
>> redirect stdout to /dev/null as suggested in the docs [0].  This seems
>> to eliminate all of the test chatter except for this one message:
>> NOTICE:  database "regression" does not exist, skipping

> Yeah, that's bugged me too ever since we got to the point where that
> was the only output ...

Actually ... why shouldn't we suppress that by running the command
with client_min_messages = warning?  This would have to be a change
to pg_regress, but I'm having a hard time thinking of cases where
quieting that message would be a problem.

I tried doing this as a one-liner change in pg_regress's
drop_database_if_exists(), but the idea fell over pretty
quickly, because what underlies that is a "psql -c" call:

$ psql -c 'set client_min_messages = warning; drop database if exists foo'
ERROR:  DROP DATABASE cannot run inside a transaction block

We could dodge that, with modern versions of psql, by issuing
two -c switches.  So after a bit of hacking I have the attached
POC patch.  It's incomplete because now that we have this
infrastructure we should change other parts of pg_regress
to not launch psql N times where one would do.  But it's enough
to get through check-world without any chatter.
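The escaping this relies on (backslash-protecting the characters that stay special inside shell double quotes) can be sketched standalone; `escape_for_shell_dquotes` is a hypothetical helper here, while the real logic lives in pg_regress's psql_add_command:

```c
#include <assert.h>
#include <string.h>

/* Backslash-escape the characters the shell still interprets inside
 * a double-quoted string: backslash, double quote, dollar, backquote.
 * The caller must provide a dst buffer at least twice as large as src. */
static void
escape_for_shell_dquotes(const char *src, char *dst)
{
	for (; *src; src++)
	{
		if (strchr("\\\"$`", *src))
			*dst++ = '\\';
		*dst++ = *src;
	}
	*dst = '\0';
}
```

Since each -c argument is wrapped in double quotes when the command line is handed to system(), this is exactly the set of metacharacters that must be neutralized.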

Any objections to polishing this up and pushing it?

regards, tom lane

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 05296f7ee1..cfadc0cd70 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -122,7 +122,9 @@ static void make_directory(const char *dir);
 
 static void header(const char *fmt,...) pg_attribute_printf(1, 2);
 static void status(const char *fmt,...) pg_attribute_printf(1, 2);
-static void psql_command(const char *database, const char *query,...) pg_attribute_printf(2, 3);
+static StringInfo psql_start_command(void);
+static void psql_add_command(StringInfo buf, const char *query,...) pg_attribute_printf(2, 3);
+static void psql_end_command(StringInfo buf, const char *database);
 
 /*
  * allow core files if possible.
@@ -1136,51 +1138,94 @@ config_sspi_auth(const char *pgdata, const char *superuser_name)
 #endif			/* ENABLE_SSPI */
 
 /*
- * Issue a command via psql, connecting to the specified database
+ * psql_start_command, psql_add_command, psql_end_command
+ *
+ * Issue one or more commands within one psql call.
+ * Set up with psql_start_command, then add commands one at a time
+ * with psql_add_command, and finally execute with psql_end_command.
  *
  * Since we use system(), this doesn't return until the operation finishes
  */
+static StringInfo
+psql_start_command(void)
+{
+	StringInfo	buf = makeStringInfo();
+
+	appendStringInfo(buf,
+	 "\"%s%spsql\" -X",
+	 bindir ? bindir : "",
+	 bindir ? "/" : "");
+	return buf;
+}
+
 static void
-psql_command(const char *database, const char *query,...)
+psql_add_command(StringInfo buf, const char *query,...)
 {
-	char		query_formatted[1024];
-	char		query_escaped[2048];
-	char		psql_cmd[MAXPGPATH + 2048];
-	va_list		args;
-	char	   *s;
-	char	   *d;
+	StringInfoData cmdbuf;
+	const char *cmdptr;
+
+	/* Add each command as a -c argument in the psql call */
+	appendStringInfoString(buf, " -c \"");
 
 	/* Generate the query with insertion of sprintf arguments */
-	va_start(args, query);
-	vsnprintf(query_formatted, sizeof(query_formatted), query, args);
-	va_end(args);
+	initStringInfo(&cmdbuf);
+	for (;;)
+	{
+		va_list		args;
+		int			needed;
+
+		va_start(args, query);
+		needed = appendStringInfoVA(&cmdbuf, query, args);
+		va_end(args);
+		if (needed == 0)
+			break;				/* success */
+		enlargeStringInfo(&cmdbuf, needed);
+	}
 
 	/* Now escape any shell double-quote metacharacters */
-	d = query_escaped;
-	for (s = query_formatted; *s; s++)
+	for (cmdptr = cmdbuf.data; *cmdptr; cmdptr++)
 	{
-		if (strchr("\\\"$`", *s))
-			*d++ = '\\';
-		*d++ = *s;
+		if (strchr("\\\"$`", *cmdptr))
+			appendStringInfoChar(buf, '\\');
+		appendStringInfoChar(buf, *cmdptr);
 	}
-	*d = '\0';
 
-	/* And now we can build and execute the shell command */
-	snprintf(psql_cmd, sizeof(psql_cmd),
-			 "\"%s%spsql\" -X -c \"%s\" \"%s\"",
-			 bindir ? bindir : "",
-			 bindir ? "/" : "",
-			 query_escaped,
-			 database);
+	appendStringInfoChar(buf, '"');
+
+	pfree(cmdbuf.data);
+}
 
-	if (system(psql_cmd) != 0)
+static void
+psql_end_command(StringInfo buf, const char *database)
+{
+	/* Add the database name --- assume it needs no extra escaping */
+	appendStringInfo(buf,
+	 " \"%s\"",
+	 database);
+
+	/* And now we can execute the shell command */
+	if (system(buf->data) != 0)
 	{
 		/* psql probably already reported the error */
-		fprintf(stderr, _("command failed: %s\n"), psql_cmd);
+		fprintf(stderr, _("command failed: %s\n"), buf->data);
 		exit(2);
 	}
+
+	/* Clean up */
+	pfree(buf->data);
+	pfree(buf);
 }
 
+/*
+ * Shorthand macro for the common case of a single command

Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2021-10-19 Thread Andres Freund
Hi,

On 2021-10-11 16:48:01 -0400, Melanie Plageman wrote:
> On Fri, Oct 8, 2021 at 1:56 PM Andres Freund  wrote:
> > On 2021-10-01 16:05:31 -0400, Melanie Plageman wrote:
> > > diff --git a/src/backend/utils/init/postinit.c 
> > > b/src/backend/utils/init/postinit.c
> > > index 78bc64671e..fba5864172 100644
> > > --- a/src/backend/utils/init/postinit.c
> > > +++ b/src/backend/utils/init/postinit.c
> > > @@ -670,8 +670,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const 
> > > char *username,
> > >   EnablePortalManager();
> > >
> > >   /* Initialize status reporting */
> > > - if (!bootstrap)
> > > - pgstat_beinit();
> > > + pgstat_beinit();
> > >
> > >   /*
> > >* Load relcache entries for the shared system catalogs.  This must 
> > > create
> > > --
> > > 2.27.0
> > >
> >
> > I think it's good to remove more and more of these !bootstrap cases - they
> > really make it harder to understand the state of the system at various
> > points. Optimizing for the rarely executed bootstrap mode at the cost of
> > checks in very common codepaths...
>
> What scope do you suggest for this patch set? A single patch which does
> this in more locations (remove !bootstrap) or should I remove this patch
> from the patchset?

I think the scope is fine as-is.


> > Is pgstattuple the best place for this helper? It's not really pgstatfuncs
> > specific...
> >
> > It also looks vaguely familiar - I wonder if we have a helper roughly like
> > this somewhere else already...
> >
>
> I don't see a function which is specifically a utility to make a
> tuplestore. Looking at the callers of tuplestore_begin_heap(), I notice
> very similar code to the function I added in pg_tablespace_databases()
> in utils/adt/misc.c, pg_stop_backup_v2() in xlogfuncs.c,
> pg_event_trigger_dropped_objects() and pg_event_trigger_ddl_commands in
> event_tigger.c, pg_available_extensions in extension.c, etc.
>
> Do you think it makes sense to refactor this code out of all of these
> places?

Yes, I think it'd make sense. We have about 40 copies of this stuff, which is
fairly ridiculous.


> If so, where would such a utility function belong?

Not quite sure. src/backend/utils/fmgr/funcapi.c maybe? I suggest starting a
separate thread about that...


> > > @@ -2999,6 +3036,14 @@ pgstat_shutdown_hook(int code, Datum arg)
> > >  {
> > >   Assert(!pgstat_is_shutdown);
> > >
> > > + /*
> > > +  * Only need to send stats on IO Ops for IO Paths when a process 
> > > exits, as
> > > +  * pg_stat_get_buffers() will read from live backends' 
> > > PgBackendStatus and
> > > +  * then sum this with totals from exited backends persisted by the 
> > > stats
> > > +  * collector.
> > > +  */
> > > + pgstat_send_buffers();
> > > +
> > >   /*
> > >* If we got as far as discovering our own database ID, we can 
> > > report what
> > >* we did to the collector.  Otherwise, we'd be sending an invalid
> > > @@ -3092,6 +3137,30 @@ pgstat_send(void *msg, int len)
> > >  #endif
> > >  }
> >
> > I think it might be nicer to move pgstat_beshutdown_hook() to be a
> > before_shmem_exit(), and do this in there.
> >
>
> Not really sure the correct way to do this. A cursory attempt to do so
> failed because ShutdownXLOG() is also registered as a
> before_shmem_exit() and ends up being called after
> pgstat_beshutdown_hook(). pgstat_beshutdown_hook() zeroes out
> PgBackendStatus, ShutdownXLOG() initiates a checkpoint, and during a
> checkpoint, the checkpointer increments IO op counter for writes in its
> PgBackendStatus.

I think we'll really need to do a proper redesign of the shutdown callback
mechanism :(.



> > > +static void
> > > +pgstat_recv_io_path_ops(PgStat_MsgIOPathOps *msg, int len)
> > > +{
> > > + int io_path;
> > > + PgStatIOOps *src_io_path_ops = msg->iop.io_path_ops;
> > > + PgStatIOOps *dest_io_path_ops =
> > > + globalStats.buffers.ops[msg->backend_type].io_path_ops;
> > > +
> > > + for (io_path = 0; io_path < IOPATH_NUM_TYPES; io_path++)
> > > + {
> > > + PgStatIOOps *src = &src_io_path_ops[io_path];
> > > + PgStatIOOps *dest = &dest_io_path_ops[io_path];
> > > +
> > > + dest->allocs += src->allocs;
> > > + dest->extends += src->extends;
> > > + dest->fsyncs += src->fsyncs;
> > > + dest->writes += src->writes;
> > > + }
> > > +}
> >
> > Could this, with a bit of finessing, use pgstat_add_io_path_ops()?
> >
>
> I didn't really see a good way to do this -- given that
> pgstat_add_io_path_ops() adds IOOps members to PgStatIOOps members --
> which requires a pg_atomic_read_u64() and pgstat_recv_io_path_ops adds
> PgStatIOOps to PgStatIOOps which doesn't require pg_atomic_read_u64().
> Maybe I could pass a flag which, based on the type, either does or
> doesn't use pg_atomic_read_u64 to access the value? But that seems worse
> to me.

Yea, you're probably right, 

Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)

2021-10-19 Thread Jeff Davis
On Mon, 2021-09-27 at 11:15 -0700, Mark Dilger wrote:
> Allow non-superusers to create event triggers.  The logic already
> existed and is unchanged to handle event triggers owned by non-
> superusers and conditioning those triggers firing on the (trigger-
> owner, role-performing-event) pair.  The v7 patch set didn't go quite
> so far as allowing non-superusers to create event triggers, but that
> undercuts much of the benefit of the changes for no obvious purpose.

The thread on role self-administration seems like a dependency here.
And it doesn't look like there's consensus that we should be
conditioning event trigger firing on role membership:

https://postgr.es/m/20211005043438.gb314...@rfd.leadboat.com

Instead, how about:

* make a predefined role pg_event_trigger that allows creating event
triggers
* make it an error for a superuser to fire an event trigger created by
a non-superuser

It doesn't solve the problem hierarchically, but we don't solve other
predefined role privileges hierarchically, either (and for many of them
it makes no sense).

A downside is that the privileged event trigger creator could
accidentally make life annoying for a superuser that's trying to issue
DDL: the superuser would need to disable the event trigger, perform the
action, then re-enable it. But that shouldn't be a practical problem in
sane setups -- superusers shouldn't be performing a lot of DDL, and if
they are, it's good to be explicit that they are bypassing something
configured by their pseudo-admin.

Regards,
Jeff Davis






cursor use vs pg_stat_statements

2021-10-19 Thread Andrew Dunstan


The problem I'm writing about (h/t Simon Riggs for finding it) is
illustrated by the following snippet of java:

  public static void runtest(Connection conn) throws Exception {
    conn.setAutoCommit(false); // PgJDBC only uses a cursor-based portal with autocommit off
    Statement stmt = conn.createStatement();
    stmt.setFetchSize(10);
    ResultSet rs = stmt.executeQuery("select oid, relfilenode, relname from pg_class");
    int count = 100;
    while (rs.next() && count-- > 0) {
      System.out.print(".");
    }
    rs.close();
    conn.commit();
    stmt.close();
    System.out.println("");
  }

When called, this prints out a line with 100 dots showing 100 rows were
fetched, but pg_stat_statements shows this:

query | select oid, relfilenode, relname from pg_class
calls | 1
rows  | 10


suggesting only 10 rows were returned. It appears that only the first
"EXECUTE 10" command against the portal is counted. At the very least
this is a POLA violation, and it seems to be a bug. Maybe it's
documented somewhere but if so it's not obvious to me.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [RFC] building postgres with meson

2021-10-19 Thread Tom Lane
Andres Freund  writes:
> On 2021-10-12 01:37:21 -0700, Andres Freund wrote:
>> As far as I can tell the only OS that postgres currently supports that
>> meson doesn't support is HPUX. It'd likely be fairly easy to add
>> gcc-on-hpux support, a chunk more to add support for the proprietary
>> ones.

> Tom, wrt HPUX on pa-risc, what are your thoughts there? IIRC we gave up
> supporting HP's compiler on pa-risc a while ago.

Right.  I am still testing with gcc on HP-PA.  I'd kind of like to
keep it running just as an edge case for our spinlock support, but
I'm not sure that I want to do any huge amount of work on meson
to keep that going.

I do have a functioning OpenBSD installation on that machine, so
one alternative if the porting costs look too high is to replace
gaur with an OpenBSD animal.  However, last I checked, OpenBSD
was about half the speed of HPUX on that hardware, so I'm not
real eager to go that way.  gaur's already about the slowest
animal in the farm :-(

> As I said it'd probably not be too hard to add meson support for hpux on hppa,
> it's probably just a few branches. But that'd require access somewhere. The
> gcc compile farm does not have a hppa member anymore...

If you've got an idea where to look, I could add that to my
to-do queue.

In any case, I don't think we need to consider HPUX as a blocker
for the meson approach.  The value-add from keeping gaur going
probably isn't terribly much.  I'm more concerned about the
effort involved in getting meson going on some other old animals,
such as prairiedog.

regards, tom lane




ThisTimeLineID can be used uninitialized

2021-10-19 Thread Robert Haas
Hi,

This is a followup to
http://postgr.es/m/ca+tgmoz5a26c6oxkapafyuy_sx0vg6vxdd_q6asezsvrphd...@mail.gmail.com.
I'm suspicious of the following code in CreateReplicationSlot:

/* setup state for WalSndSegmentOpen */
sendTimeLineIsHistoric = false;
sendTimeLine = ThisTimeLineID;

The first thing that's odd about this is that if this is physical
replication, it's apparently dead code, because AFAICT sendTimeLine
will not be used for anything in that case. If it's logical
replication, there's a case where this will set sendTimeLine to 0,
which seems very strange, since that is not a valid timeline. To
reproduce, do this:

1. Create a new primary database with tli=1. Create a standby.

2. On the standby, fire up a database-connected replication, something
like this:
psql 'port=5433 replication=database dbname=rhaas'
Don't execute any commands yet!

3. From some other backend, promote the standby:
select pg_promote();
It now gets a TLI of 2.

4. Try to create a logical replication slot perhaps using something like this:
CREATE_REPLICATION_SLOT "foo" LOGICAL "test_decoding" ( SNAPSHOT 'nothing');

If the system had been in normal running when you started the session,
it would be initialized, because InitPostgres() calls
RecoveryInProgress(). But since that only initializes it during normal
running, and not during recovery, it doesn't help in this scenario.
And if on the other hand you had not promoted the standby as in step
2, then we'd still set sendTimeLine = 0 here, but then we'd almost
immediately call CheckLogicalDecodingRequirements() and error out
without doing anything with the value. Here, however, we continue on.

But I don't know if it matters. We call CreateInitDecodingContext()
with sendTimeLine and ThisTimeLineID still zero; it doesn't call any
callbacks. Then we call DecodingContextFindStartpoint() with
sendTimeLine still 0 and the first callback that gets invoked is
logical_read_xlog_page. At this point sendTimeLine = 0 and
ThisTimeLineID = 0. That calls XLogReadDetermineTimeline() which
resets ThisTimeLineID to the correct value of 2, but when we get back
to logical_read_xlog_page, we still manage to call WALRead with a
timeline of 0 because state->seg.ws_tli is still 0. And WALRead
eventually does call WalSndOpen, which unconditionally propagates
sendTimeLine into the TLI pointer that is passed to it. So now
state->seg.ws_tli also ends up being 2. So I guess maybe nothing bad
happens? But it sure seems strange that the code would apparently work
just as well as it does today with the following patch:

diff --git a/src/backend/replication/walsender.c
b/src/backend/replication/walsender.c
index b811a5c0ef..44fd598519 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -945,7 +945,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)

 /* setup state for WalSndSegmentOpen */
 sendTimeLineIsHistoric = false;
-sendTimeLine = ThisTimeLineID;
+sendTimeLine = rand() % 10;

 if (cmd->kind == REPLICATION_KIND_PHYSICAL)
 {

And in fact, that passes make check-world. :-(

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [Proposal] Global temporary tables

2021-10-19 Thread Andrew Bille
Another thanks for the fix. It works for me.

But I found another crash!

On master with the v56 patches applied:

initdb -D data
pg_ctl -w -t 5 -D data -l server.log start
echo "create global temp table t(i int4); insert into t values (1); vacuum t;" > tmp.sql
psql < tmp.sql

CREATE TABLE
INSERT 0 1
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost

with following stack:
[New LWP 2192409]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: andrew regression [local] VACUUM'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x7fb26b558859 in __GI_abort () at abort.c:79
#2  0x5627ddd8466c in ExceptionalCondition
(conditionName=conditionName@entry=0x5627dde153d0
"TransactionIdIsNormal(relfrozenxid)", errorType=errorType@entry=0x5627ddde100b
"FailedAssertion", fileName=fileName@entry=0x5627dddfa697 "vacuum.c",
lineNumber=lineNumber@entry=1170) at assert.c:69
#3  0x5627dda70808 in vacuum_xid_failsafe_check
(relfrozenxid=<optimized out>, relminmxid=<optimized out>) at vacuum.c:1170
#4  0x5627dd8db7ee in lazy_check_wraparound_failsafe
(vacrel=vacrel@entry=0x5627df5c9680) at vacuumlazy.c:2607
#5  0x5627dd8ded18 in lazy_scan_heap (vacrel=vacrel@entry=0x5627df5c9680,
params=params@entry=0x7fffb3d36100, aggressive=aggressive@entry=true) at
vacuumlazy.c:978
#6  0x5627dd8e019a in heap_vacuum_rel (rel=0x7fb26218af70,
params=0x7fffb3d36100, bstrategy=<optimized out>) at vacuumlazy.c:644
#7  0x5627dda70033 in table_relation_vacuum (bstrategy=<optimized out>,
params=0x7fffb3d36100, rel=0x7fb26218af70) at
../../../src/include/access/tableam.h:1678
#8  vacuum_rel (relid=16385, relation=<optimized out>,
params=params@entry=0x7fffb3d36100)
at vacuum.c:2124
#9  0x5627dda71624 in vacuum (relations=0x5627df610598,
params=params@entry=0x7fffb3d36100, bstrategy=<optimized out>,
bstrategy@entry=0x0, isTopLevel=isTopLevel@entry=true) at vacuum.c:476
#10 0x5627dda71eb1 in ExecVacuum (pstate=pstate@entry=0x5627df567440,
vacstmt=vacstmt@entry=0x5627df545e70, isTopLevel=isTopLevel@entry=true) at
vacuum.c:269
#11 0x5627ddc4a8cc in standard_ProcessUtility (pstmt=0x5627df5461c0,
queryString=0x5627df545380 "vacuum t;", readOnlyTree=<optimized out>,
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x5627df5462b0, qc=0x7fffb3d36470) at utility.c:858
#12 0x5627ddc4ada1 in ProcessUtility (pstmt=pstmt@entry=0x5627df5461c0,
queryString=<optimized out>, readOnlyTree=<optimized out>,
context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=<optimized out>,
queryEnv=<optimized out>, dest=0x5627df5462b0, qc=0x7fffb3d36470) at
utility.c:527
#13 0x5627ddc4822d in PortalRunUtility (portal=portal@entry=0x5627df5a67e0,
pstmt=pstmt@entry=0x5627df5461c0, isTopLevel=isTopLevel@entry=true,
setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x5627df5462b0,
qc=qc@entry=0x7fffb3d36470) at pquery.c:1155
#14 0x5627ddc48551 in PortalRunMulti (portal=portal@entry=0x5627df5a67e0,
isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false,
dest=dest@entry=0x5627df5462b0, altdest=altdest@entry=0x5627df5462b0,
qc=qc@entry=0x7fffb3d36470) at pquery.c:1312
#15 0x5627ddc4896c in PortalRun (portal=portal@entry=0x5627df5a67e0,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,
run_once=run_once@entry=true, dest=dest@entry=0x5627df5462b0,
altdest=altdest@entry=0x5627df5462b0, qc=0x7fffb3d36470) at pquery.c:788
#16 0x5627ddc44afb in exec_simple_query
(query_string=query_string@entry=0x5627df545380
"vacuum t;") at postgres.c:1214
#17 0x5627ddc469df in PostgresMain (dbname=<optimized out>,
username=<optimized out>) at postgres.c:4497
#18 0x5627ddb9fe7d in BackendRun (port=port@entry=0x5627df566580) at
postmaster.c:4560
#19 0x5627ddba3001 in BackendStartup (port=port@entry=0x5627df566580)
at postmaster.c:4288
#20 0x5627ddba3248 in ServerLoop () at postmaster.c:1801
#21 0x5627ddba482a in PostmasterMain (argc=3, argv=<optimized out>) at
postmaster.c:1473
#22 0x5627ddae4d1d in main (argc=3, argv=0x5627df53f750) at main.c:198

On Mon, Oct 18, 2021 at 7:00 PM wenjing  wrote:

> Hi Andrew
> I fixed the problem, please confirm again.
> Thanks
>
> Wenjing
>


Re: XTS cipher mode for cluster file encryption

2021-10-19 Thread Stephen Frost
Greetings,

* Sasasu (i...@sasa.su) wrote:
> On 2021/10/19 00:37, Robert Haas wrote:
> >I think what we ought to be doing at
> >this point is combining our efforts to try to get some things
> >committed which make future work in this area committed - like that
> >patch to preserve relfilenode and database OID, or maybe some patches
> >to drive all of our I/O through a smaller number of code paths instead
> >of having every different type of temporary file we write reinvent the
> >wheel.
> 
> A unified block-based I/O API sounds great. Has anyone tried to do this
> before? It would be nice if the front-end tools could also use these API.

The TDE patch from Cybertec did go down this route, but the API ended up
being rather different which meant a lot of changes in other parts of
the system.  If we can get a block-based temporary file method that
maintains more-or-less the same API, that'd be great, but I'm not sure
that we can really do so and I am not entirely convinced that we should
make the TDE effort depend on an otherwise quite independent effort of
making all temp files usage be block based.

> As there are so many threat models, I propose implementing the TDE feature
> as a set of hooks. Those hooks sit on the critical path of I/O operations
> and add the ability for an extension to replace the I/O API; extensions
> would also be loaded by initdb, in single-user mode, and in the front-end
> tools.
> This sounds like using $LD_PRELOAD to replace pread(2) and pwrite(2), which
> is widely used in folder-based encryption, but the hook would pass more
> context (filenode, table OID, block size, and so on) to the layer below;
> this hook API would look like object_access_hook.
> Then implement the simplest AES-XTS and put it in contrib, and provide a
> tool to deactivate AES-XTS to keep PostgreSQL upgradeable.
> 
> I think this is the most peaceful method. GCM people will not reject it
> just because of XTS, and XTS people will (maybe?) be satisfied with the
> complexity. For performance, it is just one more long jump compared with
> the current AES-XTS code.

I agree with Robert- using hooks for this really isn't realistic.  Where
would you store the tag for GCM without changes in core, for starters..?
Certainly wouldn't make sense to provide GCM only to throw the tag away.
Even between XTS and GCM, to say nothing of other possible methods,
there's going to be some serious differences that a single hook-based
API wouldn't be able to address.

Thanks,

Stephen




Re: parallelizing the archiver

2021-10-19 Thread Stephen Frost
Greetings,

* Magnus Hagander (mag...@hagander.net) wrote:
> Backwards compatibility is definitely a must, I'd say. Regardless of
> exactly how the backwards-compatible plugin is shipped, it should be what's
> turned on by default.

I keep seeing this thrown around and I don't quite get why we feel this
is the case.  I'm not completely against trying to maintain backwards
compatibility, but at the same time, we just went through changing quite
a bit around in v12 with the restore command and that's the other half
of this.  Why are we so concerned about backwards compatibility here
when there was hardly any complaint raised about breaking it in the
restore case?

If maintaining compatibility makes this a lot more difficult or ugly,
then I'm against doing so.  I don't know that to be the case, none of
the proposed approaches really sound all that bad to me, but I certainly
don't think we should be entirely avoiding the idea of breaking
backwards compatibility here.  We literally just did that and while
there's been some noise about it, it's hardly risen to the level of
being "something we should never, ever, even consider doing again" as
seems to be implied on this thread.

For those who might argue that maintaining compatibility for archive
command is somehow more important than for restore command- allow me to
save you the trouble and just let you know that I don't buy off on such
an argument.  If anything, it should be the opposite.  You back up your
database all the time and you're likely to notice much more quickly when
it stops working.  Database restores, on the other hand, are nearly always
done in times of great stress and when you want things to be very clear
and easy to follow and for everything to 'just work'.

Thanks,

Stephen




Re: XTS cipher mode for cluster file encryption

2021-10-19 Thread Stephen Frost
Greetings,

* Tomas Vondra (tomas.von...@enterprisedb.com) wrote:
> On 10/18/21 17:56, Stephen Frost wrote:
> >* Tomas Vondra (tomas.von...@enterprisedb.com) wrote:
> >>On 10/15/21 21:22, Stephen Frost wrote:
> >>>Now, to address the concern around re-encrypting a block with the same
> >>>key+IV but different data and leaking what parts of the page changed, I
> >>>do think we should use the LSN and have it change regularly (including
> >>>unlogged tables) but that's just because it's relatively easy for us to
> >>>do and means an attacker wouldn't be able to tell what part of the page
> >>>changed when the LSN was also changed.  That was also recommended by
> >>>NIST and that's a pretty strong endorsement also.
> >>
> >>Not sure - it seems a bit weird to force LSN change even in cases that don't
> >>generate any WAL. I was not following the encryption thread and maybe it was
> >>discussed/rejected there, but I've always imagined we'd have a global nonce
> >>generator (similar to a sequence) and we'd store it at the end of each
> >>block, or something like that.
> >
> >The 'LSN' being referred to here isn't the regular LSN that is
> >associated with the WAL but rather the separate FakeLSN counter which we
> >already have.  I wasn't suggesting having the regular LSN change in
> >cases that don't generate WAL.
> 
> I'm not very familiar with FakeLSN, but isn't that just about unlogged
> tables? How does that help cases like setting hint bits, which may not
> generate WAL?

Errr, there seems to have been some confusion in this thread between the
different ideas being tossed around.

The point of using FakeLSN for unlogged tables consistently is to
provide variability in the value used as the IV.  I wasn't suggesting to
use FakeLSN to provide a uniqueness guarantee- the reason we're talking
about using XTS is specifically because, unlike CTR, unique IVs are not
required.  Further, we don't need to find any additional space on the
page if we use XTS, meaning we can do things like go from an unencrypted
to an encrypted system with a basebackup+physical replication.

> >* Robert Haas (robertmh...@gmail.com) wrote:
> >>On Fri, Oct 15, 2021 at 3:22 PM Stephen Frost  wrote:
> >>>Specifically: The default cipher for LUKS is nowadays aes-xts-plain64
> >>>
> >>>and then this:
> >>>
> >>>https://gitlab.com/cryptsetup/cryptsetup/-/wikis/DMCrypt
> >>>
> >>>where plain64 is defined as:
> >>>
> >>>plain64: the initial vector is the 64-bit little-endian version of the
> >>>sector number, padded with zeros if necessary
> >>>
> >>>That is, the default for LUKS is AES, XTS, with a simple IV.  That
> >>>strikes me as a pretty ringing endorsement.
> >>
> >>Yes, that sounds promising. It might not hurt to check for other
> >>precedents as well, but that seems like a pretty good one.
> >>
> >>I'm not very convinced that using the LSN for any of this is a good
> >>idea. Something that changes most of the time but not all the time
> >>seems more like it could hurt by masking fuzzy thinking more than it
> >>helps anything.
> >
> >This argument doesn't come across as very strong at all to me,
> >particularly when we have explicit recommendations from NIST that having
> >the IV vary more is beneficial.  While this would be using the LSN, the
> >fact that the LSN changes most of the time but not all of the time isn't
> >new and is something we already have to deal with.  I'd think we'd
> >address the concern about mis-thinking around how this works by
> >providing a README and/or an appropriate set of comments around what's
> >being done and why.
> 
> I don't think anyone objects to varying IV more, as recommended by NIST.

Great.

> AFAICS the issue at hand is exactly the opposite - maybe not varying it
> enough, in some cases. It might be enough for MVCC purposes yet it might
> result in fatal failure of the encryption scheme. That's my concern, at
> least, and I assume it's what Robert meant by "fuzzy thinking" too.

XTS does not require the IV to be unique for every invocation, and
indeed other systems like dm-crypt don't use a unique IV for XTS.  We
really can't divorce the encryption methodology from the parameters that
are being used.  CTR and GCM are the methods that require a unique IV
(or, at least, a very very very likely unique one if you can't actually
implement a proper counter) but that's *not* what we're talking about
here.  The methods being discussed are XTS and GCM-SIV, the former
explicitly doesn't require a unique IV and the latter is specifically
designed to reduce the impact of an IV being re-used.  Both are, as NIST
points out, better off with a varying IV, but having the IV be reused
from time to time in either XTS or GCM-SIV does not result in a fatal
failure of the encryption scheme.

> FWIW I think we seem to be mixing nonces, IVs and tweak values. Although
> various encryption schemes place different requirements on those anyway.

The differences between those are pretty subtle and it gets a bit
confusing 

Re: [RFC] building postgres with meson

2021-10-19 Thread Andres Freund
Hi Tom,

On 2021-10-12 01:37:21 -0700, Andres Freund wrote:
> As far as I can tell the only OS that postgres currently supports that
> meson doesn't support is HPUX. It'd likely be fairly easy to add
> gcc-on-hpux support, a chunk more to add support for the proprietary
> ones.

Tom, wrt HPUX on pa-risc, what are your thoughts there? IIRC we gave up
supporting HP's compiler on pa-risc a while ago.

As I said it'd probably not be too hard to add meson support for hpux on hppa,
it's probably just a few branches. But that'd require access somewhere. The
gcc compile farm does not have a hppa member anymore...

I did notice that gcc will declare hppa-hpux obsolete in gcc 12 and will
remove at some point:
"The hppa[12]*-*-hpux10* and hppa[12]*-*-hpux11* configurations targeting 
32-bit PA-RISC with HP-UX have been obsoleted and will be removed in a future 
release."
https://gcc.gnu.org/gcc-12/changes.html

Greetings,

Andres Freund




Re: [PATCH] test/ssl: rework the sslfiles Makefile target

2021-10-19 Thread Jacob Champion
On Tue, 2021-10-19 at 20:21 +0200, Daniel Gustafsson wrote:
> > On 15 Oct 2021, at 14:01, Daniel Gustafsson  wrote:
> > The attached contains the requested fixes, and has been tested on all (by 
> > us)
> > supported versions of OpenSSL.  Unless there are objections I will apply 
> > this
> > to master shortly.
> 
> Which is now done.

Thanks very much! Hopefully this makes that area of the code easier on
everyone.

--Jacob


Re: [PATCH] test/ssl: rework the sslfiles Makefile target

2021-10-19 Thread Daniel Gustafsson
> On 15 Oct 2021, at 14:01, Daniel Gustafsson  wrote:

> The attached contains the requested fixes, and has been tested on all (by us)
> supported versions of OpenSSL.  Unless there are objections I will apply this
> to master shortly.

Which is now done.

--
Daniel Gustafsson   https://vmware.com/





Re: ldap/t/001_auth.pl fails with openldap 2.5

2021-10-19 Thread Andres Freund
Hi,

On 2021-10-10 00:45:41 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Although I'm mildly tempted to rewrap the parameters, it's kinda odd how the
> > trailing parameter on one line, has its value on the next line.
> 
> I'm betting that perltidy did that.  If you want to fix it so it
> stays fixed, maybe reordering the parameters could help.

You were right on that front. Since perltidy insisted on reflowing due to the
reduction in number of parameters anyway, I did end up switching things around
so that the parameters look a bit more reasonable.

> > So, does anybody see a reason not to go for the trivial
> > [ patch ]
> 
> I'd be happy to rely on the buildfarm's opinion here.

Let's see what it says...

Greetings,

Andres Freund




Re: Parallel vacuum workers prevent the oldest xmin from advancing

2021-10-19 Thread Alvaro Herrera
On 2021-Oct-19, Alvaro Herrera wrote:

> Hmm, I think this should happen before the transaction snapshot is
> established in the worker; perhaps immediately after calling
> StartParallelWorkerTransaction(), or anyway not after
> SetTransactionSnapshot.  In fact, since SetTransactionSnapshot receives
> a 'sourceproc' argument, why not do it exactly there? ISTM that
> ProcArrayInstallRestoredXmin() is where this should happen.

... and there is a question about the lock strength used for
ProcArrayLock.  The current routine uses LW_SHARED, but there's no
clarity that we can modify proc->statusFlags and ProcGlobal->statusFlags
without LW_EXCLUSIVE.

Maybe we can change ProcArrayInstallRestoredXmin so that if it sees that
proc->statusFlags is not zero, then it grabs LW_EXCLUSIVE (and copies),
otherwise it keeps using LW_SHARED as it does now (and does not copy.)

(This also suggests that using LW_EXCLUSIVE unconditionally for all
cases as your patch does is not great.  OTOH it's just once at every
bgworker start, so it's not *that* frequent.)


Initially, I was a bit nervous about copying flags willy-nilly.  Do we
need to be more careful?  I mean, have a way for the code to specify
flags to copy, maybe something like

MyProc->statusFlags |= proc->statusFlags & copyableFlags;
ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;

with this coding,
1. we do not unset flags that the bgworker already has for whatever
reason
2. we do not copy flags that may be unrelated to the effect we desire.

The problem, and it's something I don't have an answer for, is how to
specify copyableFlags.  This code is the generic ParallelWorkerMain()
and there's little-to-no chance to pass stuff from the process that
requested the bgworker.  So maybe Sawada-san's original coding of just
copying everything is okay.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




Re: pg_upgrade test chatter

2021-10-19 Thread Tom Lane
"Bossart, Nathan"  writes:
> I run 'make check-world' a lot, and I typically use parallelism and
> redirect stdout to /dev/null as suggested in the docs [0].  This seems
> to eliminate all of the test chatter except for this one message:

> NOTICE:  database "regression" does not exist, skipping

Yeah, that's bugged me too ever since we got to the point where that
was the only output ...

> We could also just create the
> database it is trying to drop to silence the NOTICE.

... but that seems like a mighty expensive way to fix it.
createdb is pretty slow on older/slower buildfarm animals.

Maybe we could run the stderr output through "grep -v", or the like?
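As an illustration of that idea (a sketch only; the message text is what the current test emits):

```shell
# Simulate the test chatter and filter out just that one NOTICE,
# letting any other stderr output through.
printf 'NOTICE:  database "regression" does not exist, skipping\nERROR:  something real\n' |
    grep -v 'database "regression" does not exist'
# prints only:  ERROR:  something real
```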

regards, tom lane




Re: UPDATE on Domain Array that is based on a composite key crashes

2021-10-19 Thread Zhihong Yu
On Tue, Oct 19, 2021 at 10:04 AM Tom Lane  wrote:

> [ please do not quote the entire thread when replying ]
>
> Zhihong Yu  writes:
> > Here is the patch.
>
> This patch seems quite misguided to me.  The proximate cause of
> the crash is that we're arriving at ExecEvalFieldStoreDeForm with
> *op->resnull and *op->resvalue both zero, which is a completely
> invalid situation for a pass-by-reference datatype; so something
> upstream of this messed up.  Even if there were an argument for
> acting as though that were a valid NULL value, this patch fails to
> do so; that'd require setting all the output fieldstore.nulls[]
> entries to true, which you didn't.
>
> Moreover, experiment quickly shows that the problem only shows up with
> an array of domain over composite, not an array of plain composite.
> The proposed patch doesn't seem to have anything to do with that
> observation.
>
> After some digging around, I see where the issue actually is:
> the expression tree we're dealing with looks like
>
>  {SUBSCRIPTINGREF
>  :refexpr
> {VAR
> }
>  :refassgnexpr
> {COERCETODOMAIN
> :arg
>{FIELDSTORE
>:arg
>   {CASETESTEXPR
>   }
>}
> }
>  }
>
> The array element we intend to replace has to be passed down to
> the CaseTestExpr, but that isn't happening.  That's because
> isAssignmentIndirectionExpr fails to recognize a tree like
> this, so ExecInitSubscriptingRef doesn't realize it needs to
> arrange for that.
>
> I believe the attached is a correct fix.
>
> regards, tom lane
>
Hi,
Tom's patch fixes the update of an individual field inside the domain type
as well.

Tom's patch looks good to me.


Re: UPDATE on Domain Array that is based on a composite key crashes

2021-10-19 Thread Tom Lane
[ please do not quote the entire thread when replying ]

Zhihong Yu  writes:
> Here is the patch.

This patch seems quite misguided to me.  The proximate cause of
the crash is that we're arriving at ExecEvalFieldStoreDeForm with
*op->resnull and *op->resvalue both zero, which is a completely
invalid situation for a pass-by-reference datatype; so something
upstream of this messed up.  Even if there were an argument for
acting as though that were a valid NULL value, this patch fails to
do so; that'd require setting all the output fieldstore.nulls[]
entries to true, which you didn't.

Moreover, experiment quickly shows that the problem only shows up with
an array of domain over composite, not an array of plain composite.
The proposed patch doesn't seem to have anything to do with that
observation.

After some digging around, I see where the issue actually is:
the expression tree we're dealing with looks like

 {SUBSCRIPTINGREF 
 :refexpr 
{VAR 
}
 :refassgnexpr 
{COERCETODOMAIN 
:arg 
   {FIELDSTORE 
   :arg 
  {CASETESTEXPR 
  }
   }
}
 }

The array element we intend to replace has to be passed down to
the CaseTestExpr, but that isn't happening.  That's because
isAssignmentIndirectionExpr fails to recognize a tree like
this, so ExecInitSubscriptingRef doesn't realize it needs to
arrange for that.

I believe the attached is a correct fix.

regards, tom lane

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 81b9d87bad..33ef39e2d4 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -3092,11 +3092,14 @@ ExecInitSubscriptingRef(ExprEvalStep *scratch, SubscriptingRef *sbsref,
  * (We could use this in FieldStore too, but in that case passing the old
  * value is so cheap there's no need.)
  *
- * Note: it might seem that this needs to recurse, but it does not; the
- * CaseTestExpr, if any, will be directly the arg or refexpr of the top-level
- * node.  Nested-assignment situations give rise to expression trees in which
- * each level of assignment has its own CaseTestExpr, and the recursive
- * structure appears within the newvals or refassgnexpr field.
+ * Note: it might seem that this needs to recurse, but in most cases it does
+ * not; the CaseTestExpr, if any, will be directly the arg or refexpr of the
+ * top-level node.  Nested-assignment situations give rise to expression
+ * trees in which each level of assignment has its own CaseTestExpr, and the
+ * recursive structure appears within the newvals or refassgnexpr field.
+ * There is an exception, though: if the array is an array-of-domain, we will
+ * have a CoerceToDomain as the refassgnexpr, and we need to be able to look
+ * through that.
  */
 static bool
 isAssignmentIndirectionExpr(Expr *expr)
@@ -3117,6 +3120,12 @@ isAssignmentIndirectionExpr(Expr *expr)
 		if (sbsRef->refexpr && IsA(sbsRef->refexpr, CaseTestExpr))
 			return true;
 	}
+	else if (IsA(expr, CoerceToDomain))
+	{
+		CoerceToDomain *cd = (CoerceToDomain *) expr;
+
+		return isAssignmentIndirectionExpr(cd->arg);
+	}
 	return false;
 }
 
diff --git a/src/test/regress/expected/domain.out b/src/test/regress/expected/domain.out
index 411d5c003e..a04bd00ac6 100644
--- a/src/test/regress/expected/domain.out
+++ b/src/test/regress/expected/domain.out
@@ -512,6 +512,30 @@ LINE 1: update dposintatable set (f1[2])[1] = array[98];
 drop table dposintatable;
 drop domain posint cascade;
 NOTICE:  drop cascades to type dposinta
+-- Test arrays over domains of composite
+create type comptype as (cf1 int, cf2 int);
+create domain dcomptype as comptype check ((value).cf1 > 0);
+create table dcomptable (f1 dcomptype[]);
+insert into dcomptable values (null);
+update dcomptable set f1[1].cf2 = 5;
+table dcomptable;
+    f1    
+----------
+ {"(,5)"}
+(1 row)
+
+update dcomptable set f1[1].cf1 = -1;  -- fail
+ERROR:  value for domain dcomptype violates check constraint "dcomptype_check"
+update dcomptable set f1[1].cf1 = 1;
+table dcomptable;
+    f1     
+-----------
+ {"(1,5)"}
+(1 row)
+
+drop table dcomptable;
+drop type comptype cascade;
+NOTICE:  drop cascades to type dcomptype
 -- Test not-null restrictions
 create domain dnotnull varchar(15) NOT NULL;
create domain dnull varchar(15);
diff --git a/src/test/regress/sql/domain.sql b/src/test/regress/sql/domain.sql
index 549c0b5adf..bf48c53e9c 100644
--- a/src/test/regress/sql/domain.sql
+++ b/src/test/regress/sql/domain.sql
@@ -267,6 +267,23 @@ drop table dposintatable;
 drop domain posint cascade;
 
 
+-- Test arrays over domains of composite
+
+create type comptype as (cf1 int, cf2 int);
+create domain dcomptype as comptype check ((value).cf1 > 0);
+
+create table dcomptable (f1 

Re: UPDATE on Domain Array that is based on a composite key crashes

2021-10-19 Thread Japin Li


On Tue, 19 Oct 2021 at 23:17, Zhihong Yu  wrote:
> On Tue, Oct 19, 2021 at 2:12 AM Zhihong Yu  wrote:
>
>>
>>
>> On Tue, Oct 19, 2021 at 12:39 AM Onder Kalaci 
>> wrote:
>>
>>> Hi hackers,
>>>
>>>
>>>
>>> I couldn’t find a similar report to this one, so starting a new thread. I
>>> can reproduce this on v14.0 as well as PostgreSQL 12.5 (not tried below
>>> versions).
>>>
>>>
>>>
>>> Steps to reproduce:
>>>
>>>
>>>
>>> CREATE TYPE two_ints as (if1 int, if2 int);
>>>
>>> CREATE DOMAIN domain AS two_ints CHECK ((VALUE).if1 > 0);
>>>
>>> CREATE TABLE domain_indirection_test (f1 int, f3 domain, domain_array
>>> domain[]);
>>>
>>> INSERT INTO domain_indirection_test (f1,f3.if1) VALUES (0, 1);
>>>
>>> UPDATE domain_indirection_test SET domain_array[0].if2 = 5;
>>>
>>> server closed the connection unexpectedly
>>>
>>> This probably means the server terminated abnormally
>>>
>>> before or while processing the request.
>>>
>>> The connection to the server was lost. Attempting reset: Failed.
>>>
>>> Time: 3.990 ms
>>>
>>> @:-!>
>>>
>>>
>>>
>>>
>>>
>>> The backtrace on PG 12.5 (As far as I remember, PG14 looks very similar) :
>>>
>>>
>>>
>>> (lldb) bt
>>>
>>> * thread #1, queue = 'com.apple.main-thread', stop reason =
>>> EXC_BAD_ACCESS (code=1, address=0x0)
>>>
>>>   * frame #0: 0x0001036584b7
>>> postgres`pg_detoast_datum(datum=0x) at fmgr.c:1741:6
>>>
>>> frame #1: 0x000103439a86
>>> postgres`ExecEvalFieldStoreDeForm(state=,
>>> op=0x7f9212045df8, econtext=) at execExprInterp.c:3025:12
>>>
>>> frame #2: 0x00010343834e
>>> postgres`ExecInterpExpr(state=, econtext=,
>>> isnull=0x7ffeec91fdc7) at execExprInterp.c:1337:4
>>>
>>> frame #3: 0x00010343742b
>>> postgres`ExecInterpExprStillValid(state=0x7f921181db18,
>>> econtext=0x7f921181d670, isNull=) at
>>> execExprInterp.c:1778:9
>>>
>>> frame #4: 0x000103444e0d
>>> postgres`ExecEvalExprSwitchContext(state=0x7f921181db18,
>>> econtext=0x7f921181d670, isNull=) at executor.h:310:13
>>>
>>> frame #5: 0x000103444cf0
>>> postgres`ExecProject(projInfo=0x7f921181db10) at executor.h:344:9
>>>
>>> frame #6: 0x000103444af6
>>> postgres`ExecScan(node=0x7f921181d560, accessMtd=(postgres`SeqNext at
>>> nodeSeqscan.c:51), recheckMtd=(postgres`SeqRecheck at nodeSeqscan.c:90)) at
>>> execScan.c:239:12
>>>
>>> frame #7: 0x000103461d17
>>> postgres`ExecSeqScan(pstate=) at nodeSeqscan.c:112:9
>>>
>>> frame #8: 0x00010344375c
>>> postgres`ExecProcNodeFirst(node=0x7f921181d560) at execProcnode.c:445:9
>>>
>>> frame #9: 0x00010345eefe
>>> postgres`ExecProcNode(node=0x7f921181d560) at executor.h:242:9
>>>
>>> frame #10: 0x00010345e74f
>>> postgres`ExecModifyTable(pstate=0x7f921181d090) at
>>> nodeModifyTable.c:2079:14
>>>
>>> frame #11: 0x00010344375c
>>> postgres`ExecProcNodeFirst(node=0x7f921181d090) at execProcnode.c:445:9
>>>
>>> frame #12: 0x00010343f25e
>>> postgres`ExecProcNode(node=0x7f921181d090) at executor.h:242:9
>>>
>>> frame #13: 0x00010343d80d
>>> postgres`ExecutePlan(estate=0x7f921181cd10,
>>> planstate=0x7f921181d090, use_parallel_mode=,
>>> operation=CMD_UPDATE, sendTuples=false, numberTuples=0,
>>> direction=ForwardScanDirection, dest=0x7f9211818c38,
>>> execute_once=) at execMain.c:1646:10
>>>
>>> frame #14: 0x00010343d745
>>> postgres`standard_ExecutorRun(queryDesc=0x7f921180e310,
>>> direction=ForwardScanDirection, count=0, execute_once=true) at
>>> execMain.c:364:3
>>>
>>> frame #15: 0x00010343d67c
>>> postgres`ExecutorRun(queryDesc=0x7f921180e310,
>>> direction=ForwardScanDirection, count=0, execute_once=) at
>>> execMain.c:308:3
>>>
>>> frame #16: 0x0001035784a8
>>> postgres`ProcessQuery(plan=, sourceText=,
>>> params=, queryEnv=0x, dest=,
>>> completionTag="") at pquery.c:161:2
>>>
>>> frame #17: 0x000103577c5e
>>> postgres`PortalRunMulti(portal=0x7f9215024110, isTopLevel=true,
>>> setHoldSnapshot=false, dest=0x7f9211818c38, altdest=0x7f9211818c38,
>>> completionTag="") at pquery.c:0
>>>
>>> frame #18: 0x00010357763d
>>> postgres`PortalRun(portal=0x7f9215024110, count=9223372036854775807,
>>> isTopLevel=, run_once=, dest=0x7f9211818c38,
>>> altdest=0x7f9211818c38, completionTag="") at pquery.c:796:5
>>>
>>> frame #19: 0x000103574f87
>>> postgres`exec_simple_query(query_string="UPDATE domain_indirection_test SET
>>> domain_array[0].if2 = 5;") at postgres.c:1215:10
>>>
>>> frame #20: 0x0001035746b8
>>> postgres`PostgresMain(argc=, argv=,
>>> dbname=, username=) at postgres.c:0
>>>
>>> frame #21: 0x00010350d712 postgres`BackendRun(port=)
>>> at postmaster.c:4494:2
>>>
>>> frame #22: 0x00010350cffa
>>> postgres`BackendStartup(port=) at postmaster.c:4177:3
>>>
>>> frame #23: 0x00010350c59c postgres`ServerLoop at
>>> 

Re: XTS cipher mode for cluster file encryption

2021-10-19 Thread Robert Haas
On Tue, Oct 19, 2021 at 11:46 AM Sasasu  wrote:
> As there are so many threat models, I propose to do the TDE feature by a
> set of hooks.

This is too invasive to do using hooks. We are inevitably going to
need to make significant changes in core.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Unexpected behavior of updating domain array that is based on a composite

2021-10-19 Thread Japin Li


Hi, hackers

While reading the patch in [1], I found an unexpected behavior when
updating a domain array that is based on a composite type.

Steps to reproduce:

1)
CREATE TYPE two_ints as (if1 int, if2 int);
CREATE DOMAIN domain AS two_ints CHECK ((VALUE).if1 > 0);
CREATE TABLE domain_indirection_test (f1 int, f3 domain, domain_array domain[]);
INSERT INTO domain_indirection_test (f1,f3.if1) VALUES (0, 1);

2) The following tests are based on the patch in [1].
UPDATE domain_indirection_test SET domain_array[0].if2 = 5;
select * from domain_indirection_test;
 f1 |  f3  |  domain_array
----+------+----------------
  0 | (1,) | [0:0]={"(,5)"}

3)
UPDATE domain_indirection_test SET domain_array[0].if1 = 10;
select * from domain_indirection_test ;
 f1 |  f3  |  domain_array
----+------+-----------------
  0 | (1,) | [0:0]={"(10,)"}
(1 row)

4)
UPDATE domain_indirection_test SET domain_array[0].if1 = 10, 
domain_array[0].if2 =  5;
select * from domain_indirection_test ;
 f1 |  f3  |  domain_array
----+------+----------------
  0 | (1,) | [0:0]={"(,5)"}
(1 row)

5) 
UPDATE domain_indirection_test SET domain_array[0].if2 = 10, 
domain_array[0].if1 =  5;
select * from domain_indirection_test ;
 f1 |  f3  |  domain_array
----+------+----------------
  0 | (1,) | [0:0]={"(5,)"}
(1 row)

6) Work as expected.
UPDATE domain_indirection_test SET domain_array[0] = (10, 5);
select * from domain_indirection_test ;
 f1 |  f3  |   domain_array
----+------+------------------
  0 | (1,) | [0:0]={"(10,5)"}
(1 row)

[1] 
https://www.postgresql.org/message-id/PH0PR21MB132823A46AA36F0685B7A29AD8BD9%40PH0PR21MB1328.namprd21.prod.outlook.com
-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: parallelizing the archiver

2021-10-19 Thread Robert Haas
On Tue, Oct 19, 2021 at 10:19 AM Magnus Hagander  wrote:
> But, is logical decoding really that great an example? I mean, we build 
> pgoutput.so as a library, we don't provide it compiled-in. So we could build 
> the "shell archiver" based on that pattern, in which case we should create a 
> postmaster/shell_archiver directory or something like that?

Well, I guess you could also use parallel contexts as an example.
There, the core facilities that most people will use are baked into
the server, but you can provide your own in an extension and the
parallel context stuff will happily call it for you if you so request.

I don't think the details here are too important. I'm just saying that
not everything needs to depend on _PG_init() as a way of bootstrapping
itself. TBH, if I ran the zoo and also had infinite time to tinker
with stuff like this, I'd probably make a pass through the hooks we
already have and try to refactor as many of them as possible to use
some mechanism other than _PG_init() to bootstrap themselves. That
mechanism actually sucks. When we use other mechanisms -- like a
language "C" function that knows the shared object name and function
name -- then load is triggered when it's needed, and the user gets the
behavior they want. Similarly with logical decoding and FDWs -- you,
as the user, say that you want this or that kind of logical decoding
or FDW or C function or whatever -- and then the system either notices
that it's already loaded and does what you want, or notices that it's
not loaded and loads it, and then does what you want.

But when the bootstrapping mechanism is _PG_init(), then the user has
got to make sure the library is loaded at the correct time. They have
to know whether it should go into shared_preload_libraries or whether
it should be put into one of the other various GUCs or if it can be
loaded on the fly with LOAD. If they don't load it in the right way,
or if it doesn't get loaded at all, well then probably it just
silently doesn't work. Plus there can be weird cases if it gets loaded
into some backends but not others and things like that.

And here we seem to have an opportunity to improve the interface by
not depending on it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: parallelizing the archiver

2021-10-19 Thread Bossart, Nathan
On 10/19/21, 6:39 AM, "David Steele"  wrote:
> On 10/19/21 8:50 AM, Robert Haas wrote:
>> I am not quite sure why we wouldn't just compile the functions into
>> the server. Functions pointers can point to core functions as surely
>> as loadable modules. The present design isn't too congenial to that
>> because it's relying on the shared library loading mechanism to wire
>> the thing in place - but there's no reason it has to be that way.
>> Logical decoding plugins don't work that way, for example. We could
>> still have a GUC, say call it archive_method, that selects the module
>> -- with 'shell' being a builtin method, and others being loadable as
>> modules. If you set archive_method='shell' then you enable this
>> module, and it has its own GUC, say call it archive_command, to
>> configure the behavior.
>>
>> An advantage of this approach is that it's perfectly
>> backward-compatible. I understand that archive_command is a hateful
>> thing to many people here, but software has to serve the user base,
>> not just the developers. Lots of people use archive_command and rely
>> on it -- and are not interested in installing yet another piece of
>> out-of-core software to do what $OTHERDB has built in.
>
> +1 to all of this, certainly for the time being. The archive_command
> mechanism is not great, but it is simple, and this part is not really
> what makes writing a good archive command hard.
>
> I had also originally envisioned this a default extension in core, but
> having the default 'shell' method built-in is certainly simpler.

I have no problem building it this way.  It's certainly better for
backward compatibility, which I think everyone here feels is
important.

Robert's proposed design is a bit more like my original proof-of-
concept [0].  There, I added an archive_library GUC which was
basically an extension of shared_preload_libraries (which creates some
interesting problems in the library loading logic).  You could only
set one of archive_command or archive_library at any given time.  When
the archive_library was set, we ran that library's _PG_init() just
like we do for any other library, and then we set the archiver
function pointer to the library's _PG_archive() function.

IIUC the main difference between this design and what Robert proposes
is that we'd also move the existing archive_command stuff somewhere
else and then access it via the archiver function pointer.  I think
that is clearly better than branching based on whether archive_command
or archive_library is set.  (BTW I'm not wedded to these GUCs.  If
folks would rather create something like the archive_method GUC, I
think that would work just as well.)

My original proof-of-concept also attempted to handle a bunch of other
shell command GUCs, but perhaps I'd better keep this focused on
archive_command for now.  What we do here could serve as an example of
how to adjust the other shell command GUCs later on.  I'll go ahead
and rework my patch to look more like what is being discussed here,
although I expect the exact design for the interface will continue to
evolve based on the feedback in this thread.

Nathan

[0] https://postgr.es/m/E9035E94-EC76-436E-B6C9-1C03FBD8EF54%40amazon.com



Re: XTS cipher mode for cluster file encryption

2021-10-19 Thread Sasasu

On 2021/10/19 00:37, Robert Haas wrote:

I think what we ought to be doing at
this point is combining our efforts to try to get some things
committed which make future work in this area committed - like that
patch to preserve relfilenode and database OID, or maybe some patches
to drive all of our I/O through a smaller number of code paths instead
of having every different type of temporary file we write reinvent the
wheel.


A unified block-based I/O API sounds great. Has anyone tried to do this 
before? It would be nice if the front-end tools could also use this API.


As there are so many threat models, I propose to implement the TDE feature 
as a set of hooks. Those hooks would sit on the critical path of I/O 
operations and give an extension the ability to replace the I/O API; the 
extension would also be loaded by initdb, in single-user mode, and in the 
front-end tools.
This sounds like using $LD_PRELOAD to replace pread(2) and pwrite(2), 
which is widely used for folder-based encryption, but the hook would pass 
more context (filenode, table OID, block size, and more) down to the lower 
layer; the hook API would look like object_access_hook.
Then implement the simplest AES-XTS mode, put it in contrib, and provide 
a tool to deactivate AES-XTS so that PostgreSQL remains upgradeable.


I think this is the most peaceful approach. The GCM proponents will not 
reject it just because of XTS, and the XTS proponents will (maybe?) be 
satisfied with the complexity. As for performance, it costs just one more 
long jump compared with the current AES-XTS code.


OpenPGP_0x4E72AF09097DAE2E.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: Added schema level support for publication.

2021-10-19 Thread vignesh C
On Tue, Oct 19, 2021 at 11:23 AM tanghy.f...@fujitsu.com
 wrote:
>
> On Tuesday, October 19, 2021 12:57 PM Amit Kapila  
> wrote:
> >
> > On Tue, Oct 19, 2021 at 9:15 AM tanghy.f...@fujitsu.com
> >  wrote:
> > >
> > > On Monday, October 18, 2021 8:23 PM vignesh C 
> > wrote:
> > > >
> > > > Thanks for the comments, the attached v42 patch has the fixes for the 
> > > > same.
> > >
> > > Thanks for your new patch.
> > >
> > > I tried your patch and found that the permission check for superuser 
> > > didn't work.
> > >
> > > For example:
> > > postgres=# create role r1;
> > > CREATE ROLE
> > > postgres=# grant all privileges on database postgres to r1;
> > > GRANT
> > > postgres=# set role r1;
> > > SET
> > > postgres=> create schema s1;
> > > CREATE SCHEMA
> > > postgres=> create publication pub for all tables in schema s1;
> > > CREATE PUBLICATION
> > >
> > > Role r1 is not superuser, but this role could create publication for all 
> > > tables in
> > schema
> > > successfully, I think it is related the following change. List 
> > > schemaidlist was
> > > not assigned yet. I think we should check it later.
> > >
> >
> > It seems this got broken in yesterday's patch version. Do you think it
> > is a good idea to add a test for this case?
> >
>
> Agreed. Thanks for your suggestion.
>
> I tried to add this test to publication.sql, a patch diff file for this test 
> is attached.

Thanks for the test case, I have merged it to the v43 version patch
attached at [1].
[1] - 
https://www.postgresql.org/message-id/CALDaNm2pJ49wAv%3DgEZrAP5%3D_apAzv_rgK3zjX-wfwCY%2BWWfT9w%40mail.gmail.com

Regards,
Vignesh




Re: Parallel vacuum workers prevent the oldest xmin from advancing

2021-10-19 Thread Alvaro Herrera
Hmm, I think this should happen before the transaction snapshot is
established in the worker; perhaps immediately after calling
StartParallelWorkerTransaction(), or anyway not after
SetTransactionSnapshot.  In fact, since SetTransactionSnapshot receives
a 'sourceproc' argument, why not do it exactly there? ISTM that
ProcArrayInstallRestoredXmin() is where this should happen.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: pg_receivewal starting position

2021-10-19 Thread Ronan Dunklau
Hello,

Following recommendations, I stripped most of the features from the patch. For 
now we support only physical replication slots, and only provide the two fields 
of interest (restart_lsn, restart_tli) in addition to the slot type (fixed at 
physical) to not paint ourselves in a corner.

I also removed the part about pg_basebackup since other fixes have been 
proposed for that case. 

Le vendredi 1 octobre 2021, 09:05:18 CEST Michael Paquier a écrit :
> On Mon, Sep 06, 2021 at 04:17:28PM +0900, Michael Paquier wrote:
> > Using READ_REPLICATION_SLOT as the command name is fine, and it could
> > be extended with more fields if necessary, implemented now with only
> > what we think is useful.  Returning errors on cases that are still not
> > supported yet is fine, say for logical slots if we decide to reject
> > the case for now, and testrictions can always be lifted in the
> > future.
> 
> And marked as RwF as this was three weeks ago.  Please feel free to
> register a new entry if this is being worked on.
> --
> Michael


-- 
Ronan Dunklau

>From 4147665664164eb597fdcc86637ec9c497c36660 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 2 Sep 2021 16:25:25 +0900
Subject: [PATCH v6 1/3] Add READ_REPLICATION_SLOT command

---
 doc/src/sgml/protocol.sgml  | 47 +++
 src/backend/replication/repl_gram.y | 16 +++-
 src/backend/replication/repl_scanner.l  |  1 +
 src/backend/replication/walsender.c | 87 +
 src/include/nodes/nodes.h   |  1 +
 src/include/nodes/replnodes.h   | 10 +++
 src/test/recovery/t/001_stream_rep.pl   | 47 ++-
 src/test/recovery/t/006_logical_decoding.pl |  9 ++-
 src/tools/pgindent/typedefs.list|  1 +
 9 files changed, 216 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index b95cc88599..51a15cc3da 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2067,6 +2067,53 @@ The commands accepted in replication mode are:
 
   
 
+  
+READ_REPLICATION_SLOT slot_name
+  READ_REPLICATION_SLOT
+
+
+ 
+  Read the information of a replication slot. Returns a tuple with
+  NULL values if the replication slot does not
+  exist. This command is currently only supported for physical slots.
+ 
+ 
+  In response to this command, the server will return a one-row result set,
+  containing the following fields:
+  
+   
+slot_type (text)
+
+ 
+  The replication slot's type, either physical or
+  NULL.
+ 
+
+   
+
+   
+restart_lsn (text)
+
+ 
+  The replication slot's restart_lsn.
+ 
+
+   
+
+   
+restart_tli (int4)
+
+ 
+  The timeline ID for the restart_lsn position,
+  when following the current timeline history.
+ 
+
+   
+  
+ 
+
+  
+
   
 START_REPLICATION [ SLOT slot_name ] [ PHYSICAL ] XXX/XXX [ TIMELINE tli ]
  START_REPLICATION
diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
index 126380e2df..913a99da5a 100644
--- a/src/backend/replication/repl_gram.y
+++ b/src/backend/replication/repl_gram.y
@@ -64,6 +64,7 @@ static SQLCmd *make_sqlcmd(void);
 /* Keyword tokens. */
 %token K_BASE_BACKUP
 %token K_IDENTIFY_SYSTEM
+%token K_READ_REPLICATION_SLOT
 %token K_SHOW
 %token K_START_REPLICATION
 %token K_CREATE_REPLICATION_SLOT
@@ -94,7 +95,7 @@ static SQLCmd *make_sqlcmd(void);
%type <node>	command
%type <node>	base_backup start_replication start_logical_replication
create_replication_slot drop_replication_slot identify_system
-timeline_history show sql_cmd
+read_replication_slot timeline_history show sql_cmd
%type <list>	base_backup_legacy_opt_list generic_option_list
%type <defelt>	base_backup_legacy_opt generic_option
%type <uintval>	opt_timeline
@@ -120,6 +121,7 @@ opt_semicolon:	';'
 
 command:
 			identify_system
+			| read_replication_slot
 			| base_backup
 			| start_replication
 			| start_logical_replication
@@ -140,6 +142,18 @@ identify_system:
 }
 			;
 
+/*
+ * READ_REPLICATION_SLOT %s
+ */
+read_replication_slot:
+			K_READ_REPLICATION_SLOT var_name
+{
+	ReadReplicationSlotCmd *n = makeNode(ReadReplicationSlotCmd);
+	n->slotname = $2;
+	$$ = (Node *) n;
+}
+			;
+
 /*
  * SHOW setting
  */
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index c038a636c3..1b599c255e 100644
--- a/src/backend/replication/repl_scanner.l
+++ b/src/backend/replication/repl_scanner.l
@@ -85,6 +85,7 @@ identifier		{ident_start}{ident_cont}*
 BASE_BACKUP			{ return K_BASE_BACKUP; }
 FAST			{ return K_FAST; }
 IDENTIFY_SYSTEM		{ return K_IDENTIFY_SYSTEM; }
+READ_REPLICATION_SLOT	{ return K_READ_REPLICATION_SLOT; }
 SHOW		{ return K_SHOW; }
 LABEL			{ return K_LABEL; }
 NOWAIT	

Re: UPDATE on Domain Array that is based on a composite key crashes

2021-10-19 Thread Zhihong Yu
On Tue, Oct 19, 2021 at 2:12 AM Zhihong Yu  wrote:

>
>
> On Tue, Oct 19, 2021 at 12:39 AM Onder Kalaci 
> wrote:
>
>> Hi hackers,
>>
>>
>>
>> I couldn’t find a similar report to this one, so starting a new thread. I
>> can reproduce this on v14.0 as well as PostgreSQL 12.5 (not tried below
>> versions).
>>
>> Steps to reproduce:
>>
>> CREATE TYPE two_ints as (if1 int, if2 int);
>> CREATE DOMAIN domain AS two_ints CHECK ((VALUE).if1 > 0);
>> CREATE TABLE domain_indirection_test (f1 int, f3 domain, domain_array domain[]);
>> INSERT INTO domain_indirection_test (f1,f3.if1) VALUES (0, 1);
>> UPDATE domain_indirection_test SET domain_array[0].if2 = 5;
>>
>> server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> The connection to the server was lost. Attempting reset: Failed.
>> Time: 3.990 ms
>> @:-!>
>>
>> The backtrace on PG 12.5 (as far as I remember, PG 14 looks very similar):
>>
>> (lldb) bt
>> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
>>   * frame #0: 0x0001036584b7 postgres`pg_detoast_datum(datum=0x) at fmgr.c:1741:6
>>     frame #1: 0x000103439a86 postgres`ExecEvalFieldStoreDeForm(state=, op=0x7f9212045df8, econtext=) at execExprInterp.c:3025:12
>>     frame #2: 0x00010343834e postgres`ExecInterpExpr(state=, econtext=, isnull=0x7ffeec91fdc7) at execExprInterp.c:1337:4
>>     frame #3: 0x00010343742b postgres`ExecInterpExprStillValid(state=0x7f921181db18, econtext=0x7f921181d670, isNull=) at execExprInterp.c:1778:9
>>     frame #4: 0x000103444e0d postgres`ExecEvalExprSwitchContext(state=0x7f921181db18, econtext=0x7f921181d670, isNull=) at executor.h:310:13
>>     frame #5: 0x000103444cf0 postgres`ExecProject(projInfo=0x7f921181db10) at executor.h:344:9
>>     frame #6: 0x000103444af6 postgres`ExecScan(node=0x7f921181d560, accessMtd=(postgres`SeqNext at nodeSeqscan.c:51), recheckMtd=(postgres`SeqRecheck at nodeSeqscan.c:90)) at execScan.c:239:12
>>     frame #7: 0x000103461d17 postgres`ExecSeqScan(pstate=) at nodeSeqscan.c:112:9
>>     frame #8: 0x00010344375c postgres`ExecProcNodeFirst(node=0x7f921181d560) at execProcnode.c:445:9
>>     frame #9: 0x00010345eefe postgres`ExecProcNode(node=0x7f921181d560) at executor.h:242:9
>>     frame #10: 0x00010345e74f postgres`ExecModifyTable(pstate=0x7f921181d090) at nodeModifyTable.c:2079:14
>>     frame #11: 0x00010344375c postgres`ExecProcNodeFirst(node=0x7f921181d090) at execProcnode.c:445:9
>>     frame #12: 0x00010343f25e postgres`ExecProcNode(node=0x7f921181d090) at executor.h:242:9
>>     frame #13: 0x00010343d80d postgres`ExecutePlan(estate=0x7f921181cd10, planstate=0x7f921181d090, use_parallel_mode=, operation=CMD_UPDATE, sendTuples=false, numberTuples=0, direction=ForwardScanDirection, dest=0x7f9211818c38, execute_once=) at execMain.c:1646:10
>>     frame #14: 0x00010343d745 postgres`standard_ExecutorRun(queryDesc=0x7f921180e310, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:364:3
>>     frame #15: 0x00010343d67c postgres`ExecutorRun(queryDesc=0x7f921180e310, direction=ForwardScanDirection, count=0, execute_once=) at execMain.c:308:3
>>     frame #16: 0x0001035784a8 postgres`ProcessQuery(plan=, sourceText=, params=, queryEnv=0x, dest=, completionTag="") at pquery.c:161:2
>>     frame #17: 0x000103577c5e postgres`PortalRunMulti(portal=0x7f9215024110, isTopLevel=true, setHoldSnapshot=false, dest=0x7f9211818c38, altdest=0x7f9211818c38, completionTag="") at pquery.c:0
>>     frame #18: 0x00010357763d postgres`PortalRun(portal=0x7f9215024110, count=9223372036854775807, isTopLevel=, run_once=, dest=0x7f9211818c38, altdest=0x7f9211818c38, completionTag="") at pquery.c:796:5
>>     frame #19: 0x000103574f87 postgres`exec_simple_query(query_string="UPDATE domain_indirection_test SET domain_array[0].if2 = 5;") at postgres.c:1215:10
>>     frame #20: 0x0001035746b8 postgres`PostgresMain(argc=, argv=, dbname=, username=) at postgres.c:0
>>     frame #21: 0x00010350d712 postgres`BackendRun(port=) at postmaster.c:4494:2
>>     frame #22: 0x00010350cffa postgres`BackendStartup(port=) at postmaster.c:4177:3
>>     frame #23: 0x00010350c59c postgres`ServerLoop at postmaster.c:1725:7
>>     frame #24: 0x00010350ac8d postgres`PostmasterMain(argc=3, argv=0x7f9210d049c0) at postmaster.c:1398:11
>>     frame #25: 0x00010347fbdd postgres`main(argc=, argv=)

Re: .ready and .done files considered harmful

2021-10-19 Thread Bossart, Nathan
On 10/19/21, 5:59 AM, "Robert Haas"  wrote:
> Nathan, I just realized we never closed the loop on this. Do you have
> any thoughts?

IMO the patch is in decent shape.  Happy to address any feedback you
might have on the latest patch [0].

Nathan

[0] https://www.postgresql.org/message-id/attachment/126789/v3-0001-Improve-performance-of-pgarch_readyXlog-with-many.patch



Re: BUG #17220: ALTER INDEX ALTER COLUMN SET (..) with an optionless opclass makes index and table unusable

2021-10-19 Thread Bossart, Nathan
On 10/18/21, 11:49 PM, "Matthijs van der Vleuten"  wrote:
> The test case doesn't seem entirely correct to me? The index being
> dropped (btree_tall_tbl_idx2) doesn't exist.

This was fixed before it was committed [0].

> Also, I don't believe this tests the case of dropping the index when
> it previously has been altered in this way.

That can still fail with the "has no options" ERROR, and fixing it
will still require a manual catalog update.  The ERROR is actually
coming from the call to index_open(), so bypassing it might require
some rather intrusive changes.  Given that it took over a year for
this bug to be reported, I suspect it might be more trouble than it's
worth.

Nathan

[0] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=fdd8857



Re: parallelizing the archiver

2021-10-19 Thread Magnus Hagander
On Tue, Oct 19, 2021 at 2:50 PM Robert Haas  wrote:

> On Mon, Oct 18, 2021 at 7:25 PM Bossart, Nathan 
> wrote:
> > I think the biggest question is where to put the archive_command
> > module, which I've called shell_archive.  The only existing directory
> > that looked to me like it might work is src/test/modules.  It might be
> > rather bold to relegate this functionality to a test module so
> > quickly, but on the other hand, perhaps it's the right thing to do
> > given we intend to deprecate it in the future.  I'm curious what
> > others think about this.
>
> I don't see that as being a viable path forward based on my customer
> interactions working here at EDB.
>
> I am not quite sure why we wouldn't just compile the functions into
> the server. Function pointers can point to core functions as surely
> as loadable modules. The present design isn't too congenial to that
> because it's relying on the shared library loading mechanism to wire
> the thing in place - but there's no reason it has to be that way.
> Logical decoding plugins don't work that way, for example. We could
> still have a GUC, say call it archive_method, that selects the module
> -- with 'shell' being a builtin method, and others being loadable as
> modules. If you set archive_method='shell' then you enable this
> module, and it has its own GUC, say call it archive_command, to
> configure the behavior.
>

Yeah, seems reasonable. It wouldn't serve as well as an example to
developers, but then it's probably not the "loadable module" part of
building it that people need examples of. So as long as it's using the same
internal APIs and just happens to be compiled in by default, I see no
problem with that.

But, is logical decoding really that great an example? I mean, we build
pgoutput.so as a library, we don't provide it compiled-in. So we could
build the "shell archiver" based on that pattern, in which case we should
create a postmaster/shell_archiver directory or something like that?

It should definitely not go under "test".


> An advantage of this approach is that it's perfectly
> backward-compatible. I understand that archive_command is a hateful
> thing to many people here, but software has to serve the user base,
> not just the developers. Lots of people use archive_command and rely
> on it -- and are not interested in installing yet another piece of
> out-of-core software to do what $OTHERDB has built in.
>

Backwards compatibility is definitely a must, I'd say. Regardless of
exactly how the backwards-compatible plugin is shipped, it should be what's
turned on by default.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: parallelizing the archiver

2021-10-19 Thread David Steele

On 10/19/21 8:50 AM, Robert Haas wrote:

> On Mon, Oct 18, 2021 at 7:25 PM Bossart, Nathan  wrote:
>> I think the biggest question is where to put the archive_command
>> module, which I've called shell_archive.  The only existing directory
>> that looked to me like it might work is src/test/modules.  It might be
>> rather bold to relegate this functionality to a test module so
>> quickly, but on the other hand, perhaps it's the right thing to do
>> given we intend to deprecate it in the future.  I'm curious what
>> others think about this.
>
> I don't see that as being a viable path forward based on my customer
> interactions working here at EDB.
>
> I am not quite sure why we wouldn't just compile the functions into
> the server. Function pointers can point to core functions as surely
> as loadable modules. The present design isn't too congenial to that
> because it's relying on the shared library loading mechanism to wire
> the thing in place - but there's no reason it has to be that way.
> Logical decoding plugins don't work that way, for example. We could
> still have a GUC, say call it archive_method, that selects the module
> -- with 'shell' being a builtin method, and others being loadable as
> modules. If you set archive_method='shell' then you enable this
> module, and it has its own GUC, say call it archive_command, to
> configure the behavior.
>
> An advantage of this approach is that it's perfectly
> backward-compatible. I understand that archive_command is a hateful
> thing to many people here, but software has to serve the user base,
> not just the developers. Lots of people use archive_command and rely
> on it -- and are not interested in installing yet another piece of
> out-of-core software to do what $OTHERDB has built in.


+1 to all of this, certainly for the time being. The archive_command 
mechanism is not great, but it is simple, and this part is not really 
what makes writing a good archive command hard.


I had also originally envisioned this as a default extension in core, but 
having the default 'shell' method built-in is certainly simpler.


Regards,
--
-David
da...@pgmasters.net




Re: Partial aggregates pushdown

2021-10-19 Thread Tomas Vondra

On 10/19/21 08:56, Alexander Pyhalov wrote:

Hi.

Tomas Vondra wrote on 2021-10-15 17:56:

As for the proposed approach, it's probably good enough for the first
version to restrict this to aggregates where the aggregate result is
sufficient, i.e. we don't need any new export/import procedures.

But it's very unlikely we'd want to restrict it the way the patch does
it, i.e. based on aggregate name. That's both fragile (people can
create new aggregates with such name) and against the PostgreSQL
extensibility (people may implement custom aggregates, but won't be
able to benefit from this just because of name).

So for v0 maybe, but I think there needs to be a way to relax this in
some way, for example we could add a new flag to pg_aggregate to mark
aggregates supporting this.



Updated patch to mark aggregates as pushdown-safe in pg_aggregate.

So far have no solution for aggregates with internal aggtranstype.


Thanks. Please add it to the next CF, so that we don't lose track of it.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Refactoring pg_dump's getTables()

2021-10-19 Thread Tom Lane
Daniel Gustafsson  writes:
> On 17 Oct 2021, at 22:05, Alvaro Herrera  wrote:
>> Maybe I would group together the changes that all require the same version
>> test, rather than keeping the output columns in the same order.

> I agree with that, if we're doing all this we might as well go all the way for
> readability.

OK, I'll make it so.

regards, tom lane




pgstat_assert_is_up() can fail in walsender

2021-10-19 Thread Amit Langote
Hi,

I can (almost) consistently reproduce $subject by executing the
attached shell script, which I was using while constructing a test
case for another thread.

The backtrace on the assert failure is this:

(gdb) bt
#0  0x7fce0b018387 in raise () from /lib64/libc.so.6
#1  0x7fce0b019a78 in abort () from /lib64/libc.so.6
#2  0x00b0bdfc in ExceptionalCondition (conditionName=0xcc0828
"pgstat_is_initialized && !pgstat_is_shutdown",
errorType=0xcc01b0 "FailedAssertion", fileName=0xcbfe12
"pgstat.c", lineNumber=4852) at assert.c:69
#3  0x008ac51e in pgstat_assert_is_up () at pgstat.c:4852
#4  0x008a9623 in pgstat_send (msg=0x7ffd16db3240, len=144) at
pgstat.c:3075
#5  0x008a7cbf in pgstat_report_replslot_drop (
slotname=0x7fce02dc6720 "pg_16399_sync_16389_7020757232905881693")
at pgstat.c:1869
#6  0x008fb06b in ReplicationSlotDropPtr (slot=0x7fce02dc6708)
at slot.c:696
#7  0x008fadbc in ReplicationSlotDropAcquired () at slot.c:585
#8  0x008faa8a in ReplicationSlotRelease () at slot.c:482
#9  0x009697c2 in ProcKill (code=1, arg=0) at proc.c:852
#10 0x00940878 in shmem_exit (code=1) at ipc.c:272
#11 0x009406a5 in proc_exit_prepare (code=1) at ipc.c:194
#12 0x009405fc in proc_exit (code=1) at ipc.c:107
#13 0x00b0c796 in errfinish (filename=0xce8525 "postgres.c",
lineno=3193,
funcname=0xcea370 <__func__.24551> "ProcessInterrupts") at elog.c:666
#14 0x00976ce4 in ProcessInterrupts () at postgres.c:3191
#15 0x00908023 in WalSndWaitForWal (loc=16785408) at walsender.c:1406
#16 0x00906f58 in logical_read_xlog_page (state=0x191d9e0,
targetPagePtr=16777216, reqLen=8192,
targetRecPtr=22502048, cur_page=0x1929150 "") at walsender.c:821
#17 0x0059f450 in ReadPageInternal (state=0x191d9e0,
pageptr=22495232, reqLen=6840) at xlogreader.c:649
#18 0x0059ec2e in XLogReadRecord (state=0x191d9e0,
errormsg=0x7ffd16db3e68) at xlogreader.c:337
#19 0x008d48fe in DecodingContextFindStartpoint
(ctx=0x191d620) at logical.c:606
#20 0x0090769a in CreateReplicationSlot (cmd=0x191c000) at
walsender.c:1038
#21 0x0090851f in exec_replication_command (
cmd_string=0x185f470 "CREATE_REPLICATION_SLOT
\"pg_16399_sync_16389_7020757232905881693\" LOGICAL pgoutput (SNAPSHOT
'use')") at walsender.c:1636
#22 0x009783d1 in PostgresMain (dbname=0x18896e8 "postgres",
username=0x18896c8 "amit") at postgres.c:4493
#23 0x008b3e1d in BackendRun (port=0x1880fb0) at postmaster.c:4560
#24 0x008b37bd in BackendStartup (port=0x1880fb0) at postmaster.c:4288
#25 0x008afd03 in ServerLoop () at postmaster.c:1801
#26 0x008af5da in PostmasterMain (argc=5, argv=0x1859ba0) at
postmaster.c:1473
#27 0x007b074b in main (argc=5, argv=0x1859ba0) at main.c:198

cc'ing Andres and Horiguchi-san, as pgstat_assert_is_up() was added in
the recent commit ee3f8d3d3ae, though I'm not sure if the problem is
that commit's fault.  I wonder if it may be the fault of the adjacent
commit fb2c5028e635.

-- 
Amit Langote
EDB: http://www.enterprisedb.com
pubnode_dir=/tmp/pubnode
pubnode_port=54331
subnode_dir=/tmp/subnode
subnode_port=54332
initdb -D $pubnode_dir > /dev/null 2>&1
echo "wal_level=logical" >> $pubnode_dir/postgresql.conf
pg_ctl -D $pubnode_dir -o "-p $pubnode_port" start
psql -p $pubnode_port -c "create table objects (color text) partition by list (color)"
psql -p $pubnode_port -c "create table objects_green partition of objects for values in ('green')"
psql -p $pubnode_port -c "create table objects_yellow partition of objects for values in ('yellow')"
psql -p $pubnode_port -c "create publication pub for table objects, objects_green with (publish_via_partition_root = true)"
psql -p $pubnode_port -c "insert into objects values ('green'), ('yellow')"


initdb -D $subnode_dir > /dev/null 2>&1
pg_ctl -D $subnode_dir -o "-p $subnode_port" start
psql -p $subnode_port -c "create table objects (color text)"
psql -p $subnode_port -c "create table objects_green (color text)"
psql -p $subnode_port -c "create table objects_yellow (color text)"
psql -p $subnode_port -c "create subscription sub connection 'port=$pubnode_port' publication pub"
psql -p $subnode_port -c "table objects"
psql -p $subnode_port -c "table objects_green"

pg_ctl -D $pubnode_dir stop
rm -r $pubnode_dir
pg_ctl -D $subnode_dir stop
rm -r $subnode_dir


Re: Refactoring pg_dump's getTables()

2021-10-19 Thread Daniel Gustafsson
> On 17 Oct 2021, at 22:05, Alvaro Herrera  wrote:
> 
> On 2021-Oct-16, Tom Lane wrote:
> 
>> Attached is a proposed patch that refactors getTables() along the
>> same lines as some previous work (eg 047329624, ed2c7f65b, daa9fe8a5)
>> to avoid having multiple partially-redundant copies of the SQL query.
>> This gets rid of nearly 300 lines of duplicative spaghetti code,
>> creates a uniform style for dealing with cross-version changes
>> (replacing at least three different methods currently being used
>> for that in this same stretch of code), and allows moving some
>> comments to be closer to the code they describe.
> 
> Yeah, this seems a lot better than the original coding.

+1

> Maybe I would group together the changes that all require the same version
> test, rather than keeping the output columns in the same order.


I agree with that, if we're doing all this we might as well go all the way for
readability.

--
Daniel Gustafsson   https://vmware.com/





Re: when the startup process doesn't (logging startup delays)

2021-10-19 Thread Nitin Jadhav
> Apparently not, but here's a v2 anyway. In this version I made
> enable_timeout_every() a three-argument function, so that the caller
> can specify both the first time at which the timeout routine should be
> called and the interval between them, instead of only the latter. That
> seems to be more convenient for this use case, and is more powerful in
> general.

Thanks for sharing the patch. Overall approach looks good to me. But
just one concern about using enable_timeout_every() functionality. As
per my understanding the caller should calculate the first scheduled
timeout (now + interval) and pass it as the second argument but this
is not the same in 'v2-0002-Quick-testing-hack.patch'. Anyways I have
done the changes as I have mentioned (like now + interval). Kindly
correct me if I am wrong. I am attaching 2 patches here.
'v19-0001-Add-enable_timeout_every-to-fire-the-same-timeout.patch' is
the same as Robert's v2 patch. I have rebased my patch on top of this
and it is 'v19-0002-startup-progress.patch'.

> I just noticed something else which I realize is the indirect result
> of my own suggestion but which doesn't actually look all that nice.
> You've now got a call to RegisterTimeout(STARTUP_PROGRESS_TIMEOUT,
> ...) in InitPostgres, guarded by ! IsPostmasterEnvironment, and then
> another one in StartupProcessMain(). I think that's so that the
> feature works in single-user mode, but that means that technically,
> we're not reporting on the progress of the startup process. We're
> reporting progress on the startup operations that are normally
> performed by the startup process. But that means that the
> documentation isn't quite accurate (because it mentions the startup
> process specifically) and that the placement of the code in startup.c
> is suspect (because that's specifically for the startup process) and
> that basically every instance of the word "process" in the patch is
> technically a little bit wrong. I'm not sure if that's a big enough
> problem to be worth worrying about or exactly what we ought to do
> about it, but it doesn't seem fantastic. At a minimum, I think we
> should probably try to eliminate as many references to the startup
> process as we can, and talk about startup progress or startup
> operations or something like that.

Yeah right. I have modified the comments accordingly and also fixed
the other review comments related to the code comments.

I have modified the code to include a call to RegisterTimeout() in
only one place than the two calls previously. Initially I thought to
call this in begin_startup_progress_phase(). I feel this is not a
better choice since begin_startup_progress_phase function is getting
called in many places. So it calls RegisterTimeout() many times which
is not required. I feel StartupXLOG() is a better place for this since
StartupXLOG() gets called during the startup process, bootstrap
process or standalone backend. As per earlier discussion we need
support for this in the case of startup process and standalone
backend. Hence guarded this with '!IsBootstrapProcessingMode()'. Also
verified the behaviour in both of the cases. Please correct me if I am
wrong.

Thanks & Regards,
Nitin Jadhav

On Mon, Oct 18, 2021 at 9:15 PM Robert Haas  wrote:
>
> On Thu, Sep 30, 2021 at 5:08 PM Robert Haas  wrote:
> > Any thoughts on the patch I attached?
>
> Apparently not, but here's a v2 anyway. In this version I made
> enable_timeout_every() a three-argument function, so that the caller
> can specify both the first time at which the timeout routine should be
> called and the interval between them, instead of only the latter. That
> seems to be more convenient for this use case, and is more powerful in
> general.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com


v19-0001-Add-enable_timeout_every-to-fire-the-same-timeout.patch
Description: Binary data


v19-0002-startup-progress.patch
Description: Binary data


Re: .ready and .done files considered harmful

2021-10-19 Thread Robert Haas
On Fri, Sep 24, 2021 at 12:28 PM Robert Haas  wrote:
> On Thu, Sep 16, 2021 at 7:26 PM Bossart, Nathan  wrote:
> > What do you think?
>
> I think this is committable. I also went back and looked at your
> previous proposal to do files in batches, and I think that's also
> committable. After some reflection, I think I have a slight preference
> for the batching approach.
> It seems like it might lend itself to archiving multiple files in a
> single invocation of the archive_command, and Alvaro just suggested it
> again apparently not having realized that it had been previously
> proposed by Andres, so I guess it has the further advantage of being
> the thing that several committers intuitively feel like we ought to be
> doing to solve this problem.
>
> So what I am inclined to do is commit
> v1-0001-Improve-performance-of-pgarch_readyXlog-with-many.patch.
> However, v6-0001-Do-fewer-directory-scans-of-archive_status.patch has
> perhaps evolved a bit more than the other one, so I thought I should
> first ask whether any of those changes have influenced your thinking
> about the batching approach and whether you want to make any updates
> to that patch first. I don't really see that this is needed, but I
> might be missing something.

Nathan, I just realized we never closed the loop on this. Do you have
any thoughts?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: parallelizing the archiver

2021-10-19 Thread Robert Haas
On Mon, Oct 18, 2021 at 7:25 PM Bossart, Nathan  wrote:
> I think the biggest question is where to put the archive_command
> module, which I've called shell_archive.  The only existing directory
> that looked to me like it might work is src/test/modules.  It might be
> rather bold to relegate this functionality to a test module so
> quickly, but on the other hand, perhaps it's the right thing to do
> given we intend to deprecate it in the future.  I'm curious what
> others think about this.

I don't see that as being a viable path forward based on my customer
interactions working here at EDB.

I am not quite sure why we wouldn't just compile the functions into
the server. Function pointers can point to core functions as surely
as loadable modules. The present design isn't too congenial to that
because it's relying on the shared library loading mechanism to wire
the thing in place - but there's no reason it has to be that way.
Logical decoding plugins don't work that way, for example. We could
still have a GUC, say call it archive_method, that selects the module
-- with 'shell' being a builtin method, and others being loadable as
modules. If you set archive_method='shell' then you enable this
module, and it has its own GUC, say call it archive_command, to
configure the behavior.

An advantage of this approach is that it's perfectly
backward-compatible. I understand that archive_command is a hateful
thing to many people here, but software has to serve the user base,
not just the developers. Lots of people use archive_command and rely
on it -- and are not interested in installing yet another piece of
out-of-core software to do what $OTHERDB has built in.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Added schema level support for publication.

2021-10-19 Thread Amit Kapila
On Mon, Oct 18, 2021 at 5:53 PM vignesh C  wrote:
>

Few comments on latest set of patches:
===
1.
+/*
+ * Filter out the partitions whose parent tables was also specified in
+ * the publication.
+ */
+static List *
+filter_out_partitions(List *relids)

Can we name this function as filter_partitions()?

2.
+ /*
+ * If the publication publishes partition changes via their
+ * respective root partitioned tables, we must exclude partitions
+ * in favor of including the root partitioned tables. Otherwise,
+ * the function could return both the child and parent tables which
+ * could cause the data of child table double-published in
+ * subscriber side.
+ */

Let's slightly change the last part of the line in the above comment
as: "... which could cause data of the child table to be
double-published on the subscriber side."

3.
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
..
..
@@ -38,7 +40,6 @@
 #include "utils/builtins.h"
 #include "utils/catcache.h"
 #include "utils/fmgroids.h"
-#include "utils/inval.h"
 #include "utils/lsyscache.h"

Does this change belong to this patch? If not, maybe you can submit a
separate patch for this. A similar change is present in
publicationcmds.c as well, not sure if that is required as well.

4.
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
...
...
+#include "nodes/makefuncs.h"

Do we need to include this file? I am able to compile without
including this file.

v42-0003-Add-tests-for-the-schema-publication-feature-of-
5.
+-- pg_publication_tables
+SET client_min_messages = 'ERROR';
+CREATE SCHEMA sch1;
+CREATE SCHEMA sch2;
+CREATE TABLE sch1.tbl1 (a int) PARTITION BY RANGE(a);
+CREATE TABLE sch2.tbl1_part1 PARTITION OF sch1.tbl1 FOR VALUES FROM
(1) to (10);
+CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch2 WITH
(PUBLISH_VIA_PARTITION_ROOT=1);
+SELECT * FROM pg_publication_tables;
+
+DROP PUBLICATION pub;
+CREATE PUBLICATION pub FOR TABLE sch2.tbl1_part1 WITH
(PUBLISH_VIA_PARTITION_ROOT=1);
+SELECT * FROM pg_publication_tables;

Can we expand the above comment along the lines of: "Test the list of
partitions published"?

v42-0004-Add-documentation-for-the-schema-publication-fea
6.
+ 
+  pg_publication_namespace
+  schema to publication mapping
+ 
+
  
   pg_publication_rel
   relation to publication mapping
@@ -6238,6 +6243,67 @@ SCRAM-SHA-256$iteration
count:
   
  

+ 
+  pg_publication_namespace

In one place, the new catalog is placed after pg_publication_rel, and
in another place, it is before it. Shouldn't it come before
pg_publication_rel in both places, per naming order?

7.
The
+   ADD clause will add one or more tables/schemas to the
+   publication. The DROP clauses will remove one or more
+   tables/schemas from the publication.

Isn't it better to write the above as one line: "The
ADD and DROP clauses will add
and remove one or more tables/schemas from the publication."?

8.
+  
+   Add some schemas to the publication:
+
+ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA
marketing_june, sales_june;
+
+  

Can we change the schema names to just marketing and sales? Also, let's
change the description to: "Add schemas marketing and sales
to the publication sales_publication"?

9.
+[ FOR ALL TABLES
+  | FOR publication
object [, ... ] ]
 [ WITH ( publication_parameter [= value] [, ... ] ) ]
+
+where publication
object is one of:

Similar to Alter Publication, here also we should use
publication_object instead of publication object.

10.
+  
+   Create a publication that publishes all changes for tables "users" and
+   "departments" and that publishes all changes for all the tables present in
+   the schema "production":
+
+CREATE PUBLICATION production_publication FOR TABLE users,
departments, ALL TABLES IN SCHEMA production;
+
+  
+
+  
+   Create a publication that publishes all changes for all the tables
present in
+   the schemas "marketing" and "sales":

It is better to use  before and  after schema
names in above descriptions.

-- 
With Regards,
Amit Kapila.




Re: Fixing build of MSVC with OpenSSL 3.0.0

2021-10-19 Thread Daniel Gustafsson
> On 19 Oct 2021, at 12:52, Michael Paquier  wrote:
> 
> On Tue, Oct 19, 2021 at 10:34:10AM +0200, Daniel Gustafsson wrote:
>> I think we can tighten the check for GetOpenSSLVersion() a bit since we now
>> know the range of versions in the 1.x.x series.  For these checks we know we
>> want 1.1.x or 3.x.x, but never 2.x.x etc.
>> 
>> How about the (untested) attached, which encodes that knowledge, as well as
>> dies on too old OpenSSL versions?
> 
> One assumption hidden behind the scripts of src/tools/msvc/ is that we
> have never needed to support OpenSSL <= 1.0.1 these days

Right, I was going off the version stated in the documentation which doesn't
list per OS requirements.

> ..(see for
> example HAVE_X509_GET_SIGNATURE_NID always set to 1, introduced in
> 1.0.2) because the buildfarm has no need for it and there is no MSI
> for this version for years (except if compiling from source, but
> nobody would do that for an older version anyway with their right
> mind).  If you try, you would already get a compilation failure pretty
> quickly.  So I'd rather keep the code as-is and not add the extra
> sudden-death check.

Fair enough, there isn't much use in protecting against issues that will never
happen.

The other proposal, making sure that we don't see a version 2.x.x creep in (in
case a packager decides to play cute, as has happened in other OSes), seems
sane to me, but I'm also not very well versed in Windows, so you be the judge
there.

--
Daniel Gustafsson   https://vmware.com/





Re: Add jsonlog log_destination for JSON server logs

2021-10-19 Thread Michael Paquier
On Fri, Oct 08, 2021 at 12:28:58PM +0900, Michael Paquier wrote:
> 0001 and 0003 have been applied independently, attached is a rebased
> version.

Attached are rebased versions of the patch set, where I have done a
cleaner split:
- 0001 includes all the refactoring of the routines from elog.c.
- 0002 moves csv logging into its own file.
- 0003 introduces the JSON log.

0001 and 0002, the refactoring bits, are in a rather committable
shape, so I'd like to apply that as the last refactoring pieces I know
of for this thread.  0003 still needs a closer lookup, and one part I
do not like much in it is the split for [u]int and long values when it
comes to key and values.
--
Michael
From 74a4d36e5f6cac1318a14168321c34edcd7d9b86 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 19 Oct 2021 16:25:45 +0900
Subject: [PATCH v6 1/3] Some refactoring of elog-specific routines

This refactors out the following things in elog.c, for ease of use
across multiple log destinations:
- start_timestamp (including reset)
- log_timestamp
- decide if query can be logged
- backend type
- write using the elog piped protocol
- Error severity to string.

These will be reused by csvlog and jsonlog.
---
 src/include/utils/elog.h   |  12 +++
 src/backend/utils/error/elog.c | 159 +
 2 files changed, 114 insertions(+), 57 deletions(-)

diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f53607e12e..731f3e3cd8 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -442,6 +442,18 @@ extern void DebugFileOpen(void);
 extern char *unpack_sql_state(int sql_state);
 extern bool in_error_recursion_trouble(void);
 
+/* Common functions shared across destinations */
+extern void reset_formatted_start_time(void);
+extern char *get_formatted_start_time(void);
+extern char *get_formatted_log_time(void);
+extern const char *get_backend_type_for_log(void);
+extern bool	check_log_of_query(ErrorData *edata);
+extern const char *error_severity(int elevel);
+extern void write_pipe_chunks(char *data, int len, int dest);
+
+/* Destination-specific functions */
+extern void write_csvlog(ErrorData *edata);
+
 #ifdef HAVE_SYSLOG
 extern void set_syslog_parameters(const char *ident, int facility);
 #endif
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index f33729513a..a162258bab 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -175,15 +175,10 @@ static const char *err_gettext(const char *str) pg_attribute_format_arg(1);
 static pg_noinline void set_backtrace(ErrorData *edata, int num_skip);
 static void set_errdata_field(MemoryContextData *cxt, char **ptr, const char *str);
 static void write_console(const char *line, int len);
-static void setup_formatted_log_time(void);
-static void setup_formatted_start_time(void);
 static const char *process_log_prefix_padding(const char *p, int *padding);
 static void log_line_prefix(StringInfo buf, ErrorData *edata);
-static void write_csvlog(ErrorData *edata);
 static void send_message_to_server_log(ErrorData *edata);
-static void write_pipe_chunks(char *data, int len, int dest);
 static void send_message_to_frontend(ErrorData *edata);
-static const char *error_severity(int elevel);
 static void append_with_tabs(StringInfo buf, const char *str);
 
 
@@ -2289,14 +2284,23 @@ write_console(const char *line, int len)
 }
 
 /*
- * setup formatted_log_time, for consistent times between CSV and regular logs
+ * get_formatted_log_time -- compute and get the log timestamp.
+ *
+ * The timestamp is computed if not set yet, so that it is kept consistent
+ * among all the log destinations that require it to be consistent.  Note
+ * that the computed timestamp is returned in a static buffer, not
+ * palloc()'d.
  */
-static void
-setup_formatted_log_time(void)
+char *
+get_formatted_log_time(void)
 {
 	pg_time_t	stamp_time;
 	char		msbuf[13];
 
+	/* leave if already computed */
+	if (formatted_log_time[0] != '\0')
+		return formatted_log_time;
+
 	if (!saved_timeval_set)
 	{
		gettimeofday(&saved_timeval, NULL);
@@ -2318,16 +2322,34 @@ setup_formatted_log_time(void)
 	/* 'paste' milliseconds into place... */
 	sprintf(msbuf, ".%03d", (int) (saved_timeval.tv_usec / 1000));
 	memcpy(formatted_log_time + 19, msbuf, 4);
+
+	return formatted_log_time;
 }
 
 /*
- * setup formatted_start_time
+ * reset_formatted_start_time -- reset the start timestamp
  */
-static void
-setup_formatted_start_time(void)
+void
+reset_formatted_start_time(void)
+{
+	formatted_start_time[0] = '\0';
+}
+
+/*
+ * get_formatted_start_time -- compute and get the start timestamp.
+ *
+ * The timestamp is computed if not set yet.  Note that the computed
+ * timestamp is returned in a static buffer, not palloc()'d.
+ */
+char *
+get_formatted_start_time(void)
 {
 	pg_time_t	stamp_time = (pg_time_t) MyStartTime;
 
+	/* leave if already computed */
+	if (formatted_start_time[0] != '\0')
+		return formatted_start_time;
+
 	/*
 	 * 

Re: Fixing build of MSVC with OpenSSL 3.0.0

2021-10-19 Thread Michael Paquier
On Tue, Oct 19, 2021 at 10:34:10AM +0200, Daniel Gustafsson wrote:
> I think we can tighten the check for GetOpenSSLVersion() a bit since we now
> know
> the range of versions in the 1.x.x series.  For these checks we know we want
> 1.1.x or 3.x.x, but never 2.x.x etc.
> 
> How about the (untested) attached which encodes that knowledge, as well as 
> dies
> on too old OpenSSL versions?

One assumption hidden behind the scripts of src/tools/msvc/ is that we
have never needed to support OpenSSL <= 1.0.1 these days (see for
example HAVE_X509_GET_SIGNATURE_NID always set to 1, introduced in
1.0.2) because the buildfarm has no need for it and there is no MSI
for this version for years (except if compiling from source, but
nobody in their right mind would do that for an older version
anyway).  If you try, you would already get a compilation failure pretty
quickly.  So I'd rather keep the code as-is and not add the extra
sudden-death check.  Now that's only three extra lines, so..
--
Michael


signature.asc
Description: PGP signature


Re: UPDATE on Domain Array that is based on a composite key crashes

2021-10-19 Thread Japin Li


On Tue, 19 Oct 2021 at 17:12, Zhihong Yu  wrote:
> On Tue, Oct 19, 2021 at 12:39 AM Onder Kalaci  wrote:
>
>> Hi hackers,
>>
>>
>>
>> I couldn’t find a similar report to this one, so starting a new thread. I
>> can reproduce this on v14.0 as well as PostgreSQL 12.5 (not tried below
>> versions).
>>
>>
>>
>> Steps to reproduce:
>>
>>
>>
>> CREATE TYPE two_ints as (if1 int, if2 int);
>>
>> CREATE DOMAIN domain AS two_ints CHECK ((VALUE).if1 > 0);
>>
>> CREATE TABLE domain_indirection_test (f1 int, f3 domain, domain_array
>> domain[]);
>>
>> INSERT INTO domain_indirection_test (f1,f3.if1) VALUES (0, 1);
>>
>> UPDATE domain_indirection_test SET domain_array[0].if2 = 5;
>>
>> server closed the connection unexpectedly
>>
>> This probably means the server terminated abnormally
>>
>> before or while processing the request.
>>
>> The connection to the server was lost. Attempting reset: Failed.
>>
>> Time: 3.990 ms
>>
>
>  Hi,
>  It seems the following change would fix the crash:
>
> diff --git a/src/postgres/src/backend/executor/execExprInterp.c
> b/src/postgres/src/backend/executor/execExprInterp.c
> index 622cab9d4..50cb4f014 100644
> --- a/src/postgres/src/backend/executor/execExprInterp.c
> +++ b/src/postgres/src/backend/executor/execExprInterp.c
> @@ -3038,6 +3038,9 @@ ExecEvalFieldStoreDeForm(ExprState *state,
> ExprEvalStep *op, ExprContext *econte
>  HeapTupleHeader tuphdr;
>  HeapTupleData tmptup;
>
> +if (DatumGetPointer(tupDatum) == NULL) {
> +return;
> +}
>  tuphdr = DatumGetHeapTupleHeader(tupDatum);
>  tmptup.t_len = HeapTupleHeaderGetDatumLength(tuphdr);
>  ItemPointerSetInvalid(&(tmptup.t_self));
>
> Here is the result after the update statement:
>
> # UPDATE domain_indirection_test SET domain_array[0].if2 = 5;
> UPDATE 1
> # select * from domain_indirection_test;
>  f1 |  f3  |  domain_array
> ----+------+----------------
>   0 | (1,) | [0:0]={"(,5)"}
> (1 row)
>

Yeah, it fixes the core dump.

However, when I test the patch, I find that the update will replace all
the data in `domain` if we only update one field.

postgres=# UPDATE domain_indirection_test SET domain_array[0].if2 = 5;
UPDATE 1
postgres=# select * from domain_indirection_test ;
 f1 |  f3  |  domain_array
----+------+----------------
  0 | (1,) | [0:0]={"(,5)"}
(1 row)

postgres=# UPDATE domain_indirection_test SET domain_array[0].if1 = 10;
UPDATE 1
postgres=# select * from domain_indirection_test ;
 f1 |  f3  |  domain_array
----+------+-----------------
  0 | (1,) | [0:0]={"(10,)"}
(1 row)


So I tried updating all the fields in `domain`, and found that only the
last one is stored.

postgres=# UPDATE domain_indirection_test SET domain_array[0].if1 = 10, 
domain_array[0].if2 =  5;
UPDATE 1
postgres=# select * from domain_indirection_test ;
 f1 |  f3  |  domain_array
----+------+----------------
  0 | (1,) | [0:0]={"(,5)"}
(1 row)

postgres=# UPDATE domain_indirection_test SET domain_array[0].if2 = 10, 
domain_array[0].if1 =  5;
UPDATE 1
postgres=# select * from domain_indirection_test ;
 f1 |  f3  |  domain_array
----+------+----------------
  0 | (1,) | [0:0]={"(5,)"}
(1 row)


Does this work as expected?  For me, I think this is a bug.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)

2021-10-19 Thread Ronan Dunklau
Le lundi 27 septembre 2021, 20:15:05 CEST Mark Dilger a écrit :
> > On Sep 21, 2021, at 12:58 PM, Andrew Dunstan  wrote:
> > 
> > This patch set is failing to apply for me - it fails on patch 2.
> 
> Thanks for looking!  I have pulled together a new patch set which applies
> cleanly against current master.
> > I haven't dug terribly deeply into it yet, but I notice that there is a
> > very large increase in test volume, which appears to account for much of
> > the 44635 lines of the patch set. I think we're probably going to want
> > to reduce that. We've had complaints in the past from prominent hackers
> > about adding too much volume to the regression tests.
> 
> The v8 patch set is much smaller, with the reduction being in the size of
> regression tests covering which roles can perform SET, RESET, ALTER SYSTEM
> SET, and ALTER SYSTEM RESET and on which GUCs.  The v7 patch set did
> exhaustive testing on this, which is why it was so big.  The v8 set does
> just a sampling of GUCs per role.  The total number of lines for the patch
> set drops from 44635 to 13026, with only 1960 lines total between the .sql
> and .out tests of GUC privileges.
> > I do like the basic thrust of reducing the power of CREATEROLE. There's
> > an old legal maxim I learned in my distant youth that says "nemo dat
> > quod non habet" - Nobody can give something they don't own. This seems
> > to be in that spirit, and I approve :-)
> 
> Great!  I'm glad to hear the approach has some support.
> 
> 
> Other changes in v8:
> 
> Add a new test for subscriptions owned by non-superusers to verify that the
> tablesync and apply workers replicating their subscription won't replicate
> into schemas and tables that the subscription owner lacks privilege to
> touch.  The logic to prevent that existed in the v7 patch, but I overlooked
> adding tests for it.  Fixed.
> 
> Allow non-superusers to create event triggers.  The logic already existed
> and is unchanged to handle event triggers owned by non-superusers and
> conditioning those triggers firing on the (trigger-owner,
> role-performing-event) pair.  The v7 patch set didn't go quite so far as
> allowing non-superusers to create event triggers, but that undercuts much
> of the benefit of the changes for no obvious purpose.
> 
> 
> Not changed in v8, but worth discussing:
> 
> Non-superusers are still prohibited from creating subscriptions, despite
> improvements to the security around that circumstance.  Improvements to the
> security model around event triggers does not have to also include
> permission for non-superuser to create event triggers, but v8 does both. 
> These could be viewed as inconsistent choices, but I struck the balance
> this way because roles creating event triggers does not entail them doing
> anything that they can't already do, whereas allowing arbitrary users to
> create subscriptions would entail an ordinary user causing external network
> connections being initiated.  We likely need to create another privileged
> role and require a non-superuser to be part of that role before they can
> create subscriptions.  That seems, however, like something to do as a
> follow-on patch, since tightening up the security on subscriptions as done
> in this patch doesn't depend on that.

The changes proposed around subscription management make a lot of sense, 
especially considering that now that we don't allow to run ALTER SUBSCRIPTION 
REFRESH in a function anymore, there is no way to delegate this to a non 
superuser (using a security definer function). Since it doesn't involve the 
rest of the patchset, patches 16, 17 and 18 could be separated in another 
thread / patchset. 

-- 
Ronan Dunklau






Re: [PATCH] Prefer getenv("HOME") to find the UNIX home directory

2021-10-19 Thread Anders Kaseorg

On 10/19/21 01:34, Kyotaro Horiguchi wrote:

I tend to agree with this, but seeing ssh ignore $HOME, I'm not sure
it's safe to follow the variable, at least when accessing
confidential files.  Since I don't understand the exact reasoning for
ssh's behavior, this is just my humble opinion.


According to https://bugzilla.mindrot.org/show_bug.cgi?id=3048#c1, it 
used to be supported to install the ssh binary as setuid.  A 
setuid/setgid binary needs to treat all environment variables with 
suspicion: if it can be convinced to write a file to $HOME with root 
privileges, then a user who modifies $HOME before invoking the binary 
could cause it to write to a file that the user normally couldn’t.


There’s no such concern for a binary that isn’t setuid/setgid.  Anyone 
with the ability to modify $HOME can be assumed to already have full 
control of the user account.


Anders




Re: UPDATE on Domain Array that is based on a composite key crashes

2021-10-19 Thread Zhihong Yu
On Tue, Oct 19, 2021 at 12:39 AM Onder Kalaci  wrote:

> Hi hackers,
>
>
>
> I couldn’t find a similar report to this one, so starting a new thread. I
> can reproduce this on v14.0 as well as PostgreSQL 12.5 (not tried below
> versions).
>
>
>
> Steps to reproduce:
>
>
>
> CREATE TYPE two_ints as (if1 int, if2 int);
>
> CREATE DOMAIN domain AS two_ints CHECK ((VALUE).if1 > 0);
>
> CREATE TABLE domain_indirection_test (f1 int, f3 domain, domain_array
> domain[]);
>
> INSERT INTO domain_indirection_test (f1,f3.if1) VALUES (0, 1);
>
> UPDATE domain_indirection_test SET domain_array[0].if2 = 5;
>
> server closed the connection unexpectedly
>
> This probably means the server terminated abnormally
>
> before or while processing the request.
>
> The connection to the server was lost. Attempting reset: Failed.
>
> Time: 3.990 ms
>
> @:-!>
>
>
>
>
>
> The backtrace on PG 12.5 (As far as I remember, PG14 looks very similar) :
>
>
>
> (lldb) bt
>
> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS
> (code=1, address=0x0)
>
>   * frame #0: 0x0001036584b7
> postgres`pg_detoast_datum(datum=0x) at fmgr.c:1741:6
>
> frame #1: 0x000103439a86
> postgres`ExecEvalFieldStoreDeForm(state=,
> op=0x7f9212045df8, econtext=) at execExprInterp.c:3025:12
>
> frame #2: 0x00010343834e
> postgres`ExecInterpExpr(state=, econtext=,
> isnull=0x7ffeec91fdc7) at execExprInterp.c:1337:4
>
> frame #3: 0x00010343742b
> postgres`ExecInterpExprStillValid(state=0x7f921181db18,
> econtext=0x7f921181d670, isNull=) at
> execExprInterp.c:1778:9
>
> frame #4: 0x000103444e0d
> postgres`ExecEvalExprSwitchContext(state=0x7f921181db18,
> econtext=0x7f921181d670, isNull=) at executor.h:310:13
>
> frame #5: 0x000103444cf0
> postgres`ExecProject(projInfo=0x7f921181db10) at executor.h:344:9
>
> frame #6: 0x000103444af6
> postgres`ExecScan(node=0x7f921181d560, accessMtd=(postgres`SeqNext at
> nodeSeqscan.c:51), recheckMtd=(postgres`SeqRecheck at nodeSeqscan.c:90)) at
> execScan.c:239:12
>
> frame #7: 0x000103461d17
> postgres`ExecSeqScan(pstate=) at nodeSeqscan.c:112:9
>
> frame #8: 0x00010344375c
> postgres`ExecProcNodeFirst(node=0x7f921181d560) at execProcnode.c:445:9
>
> frame #9: 0x00010345eefe
> postgres`ExecProcNode(node=0x7f921181d560) at executor.h:242:9
>
> frame #10: 0x00010345e74f
> postgres`ExecModifyTable(pstate=0x7f921181d090) at
> nodeModifyTable.c:2079:14
>
> frame #11: 0x00010344375c
> postgres`ExecProcNodeFirst(node=0x7f921181d090) at execProcnode.c:445:9
>
> frame #12: 0x00010343f25e
> postgres`ExecProcNode(node=0x7f921181d090) at executor.h:242:9
>
> frame #13: 0x00010343d80d
> postgres`ExecutePlan(estate=0x7f921181cd10,
> planstate=0x7f921181d090, use_parallel_mode=,
> operation=CMD_UPDATE, sendTuples=false, numberTuples=0,
> direction=ForwardScanDirection, dest=0x7f9211818c38,
> execute_once=) at execMain.c:1646:10
>
> frame #14: 0x00010343d745
> postgres`standard_ExecutorRun(queryDesc=0x7f921180e310,
> direction=ForwardScanDirection, count=0, execute_once=true) at
> execMain.c:364:3
>
> frame #15: 0x00010343d67c
> postgres`ExecutorRun(queryDesc=0x7f921180e310,
> direction=ForwardScanDirection, count=0, execute_once=) at
> execMain.c:308:3
>
> frame #16: 0x0001035784a8
> postgres`ProcessQuery(plan=, sourceText=,
> params=, queryEnv=0x, dest=,
> completionTag="") at pquery.c:161:2
>
> frame #17: 0x000103577c5e
> postgres`PortalRunMulti(portal=0x7f9215024110, isTopLevel=true,
> setHoldSnapshot=false, dest=0x7f9211818c38, altdest=0x7f9211818c38,
> completionTag="") at pquery.c:0
>
> frame #18: 0x00010357763d
> postgres`PortalRun(portal=0x7f9215024110, count=9223372036854775807,
> isTopLevel=, run_once=, dest=0x7f9211818c38,
> altdest=0x7f9211818c38, completionTag="") at pquery.c:796:5
>
> frame #19: 0x000103574f87
> postgres`exec_simple_query(query_string="UPDATE domain_indirection_test SET
> domain_array[0].if2 = 5;") at postgres.c:1215:10
>
> frame #20: 0x0001035746b8
> postgres`PostgresMain(argc=, argv=,
> dbname=, username=) at postgres.c:0
>
> frame #21: 0x00010350d712 postgres`BackendRun(port=)
> at postmaster.c:4494:2
>
> frame #22: 0x00010350cffa
> postgres`BackendStartup(port=) at postmaster.c:4177:3
>
> frame #23: 0x00010350c59c postgres`ServerLoop at
> postmaster.c:1725:7
>
> frame #24: 0x00010350ac8d postgres`PostmasterMain(argc=3,
> argv=0x7f9210d049c0) at postmaster.c:1398:11
>
> frame #25: 0x00010347fbdd postgres`main(argc=,
> argv=) at main.c:228:3
>
> frame #26: 0x7fff204e8f3d libdyld.dylib`start + 1
>
>
>
>
>
> Thanks,
>
> Onder KALACI
>
> Software Engineer at Microsoft &
>
> Developing the Citus database extension for PostgreSQL

Re: Error "initial slot snapshot too large" in create replication slot

2021-10-19 Thread Dilip Kumar
On Tue, Oct 12, 2021 at 11:30 AM Dilip Kumar  wrote:
>
> On Tue, Oct 12, 2021 at 10:35 AM Kyotaro Horiguchi
>  wrote:
> >
> > At Tue, 12 Oct 2021 13:59:59 +0900 (JST), Kyotaro Horiguchi 
> >  wrote in
> > > So I came up with the attached version.
> >
> > Sorry, it was losing a piece of change.
>
> Yeah, at a high level this is on the idea I have in mind, I will do a
> detailed review in a day or two.  Thanks for working on this.

While doing the detailed review, I found a couple of problems with the
patch.  The main problem with storing all the xids in snap->subxip is
that once we mark the snapshot overflowed, XidInMVCCSnapshot will not
search the subxip array; instead it will fetch the topXid and search
the snap->xip array.  Another issue is that the total number of xids
could be GetMaxSnapshotSubxidCount() + GetMaxSnapshotXidCount().

I think what we should be doing is that if the xid is a known subxid
then we add it to the snap->subxip array, otherwise to the snap->xip
array.  So the snap->xip array size will be GetMaxSnapshotXidCount()
whereas the snap->subxip array size will be
GetMaxSnapshotSubxidCount().  And if the array size is full then we can
stop adding the subxids to the array.

What is your thought on this?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] Prefer getenv("HOME") to find the UNIX home directory

2021-10-19 Thread Kyotaro Horiguchi
At Mon, 18 Oct 2021 19:23:50 -0300, Alvaro Herrera  
wrote in 
> On 2021-Oct-14, Anders Kaseorg wrote:
> 
> > This is important for systems where many users share the same UID, and
> > for test systems that change HOME to avoid interference with the
> > user’s real home directory.  It matches what most applications do, as
> > well as what glibc does for glob("~", GLOB_TILDE, …) and wordexp("~",
> > …).
> > 
> > There was some previous discussion of this in 2016, where although
> > there were some questions about the use case, there seemed to be
> > general support for the concept:
> > 
> > https://www.postgresql.org/message-id/flat/CAEH6cQqbdbXoUHJBbX9ixwfjFFsUC-a8hFntKcci%3DdiWgBb3fQ%40mail.gmail.com
> 
> I think modifying $HOME is a strange way to customize things, but given
> how widespread it is [claimed to be] today, it seems reasonable to do
> things that way.

I tend to agree with this, but seeing ssh ignore $HOME, I'm not sure
it's safe to follow the variable, at least when accessing
confidential files.  Since I don't understand the exact reasoning for
ssh's behavior, this is just my humble opinion.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Fixing build of MSVC with OpenSSL 3.0.0

2021-10-19 Thread Daniel Gustafsson
> On 19 Oct 2021, at 07:27, Michael Paquier  wrote:

> Looking at the MSIs of OpenSSL for Win64 and Win32, there are no
> changes in the deliverable names or paths, meaning that something as
> simple as the attached patch is enough to make the build pass.

Makes sense.

> Any opinions?

I think we can tighten the check for GetOpenSSLVersion() a bit since we now know
the range of versions in the 1.x.x series.  For these checks we know we want
1.1.x or 3.x.x, but never 2.x.x etc.

How about the (untested) attached which encodes that knowledge, as well as dies
on too old OpenSSL versions?

--
Daniel Gustafsson   https://vmware.com/



openssl_win23.diff
Description: Binary data


UPDATE on Domain Array that is based on a composite key crashes

2021-10-19 Thread Onder Kalaci
Hi hackers,

I couldn’t find a similar report to this one, so starting a new thread. I can 
reproduce this on v14.0 as well as PostgreSQL 12.5 (not tried below versions).

Steps to reproduce:

CREATE TYPE two_ints as (if1 int, if2 int);
CREATE DOMAIN domain AS two_ints CHECK ((VALUE).if1 > 0);
CREATE TABLE domain_indirection_test (f1 int, f3 domain, domain_array domain[]);
INSERT INTO domain_indirection_test (f1,f3.if1) VALUES (0, 1);
UPDATE domain_indirection_test SET domain_array[0].if2 = 5;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Time: 3.990 ms
@:-!>


The backtrace on PG 12.5 (As far as I remember, PG14 looks very similar) :

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS 
(code=1, address=0x0)
  * frame #0: 0x0001036584b7 
postgres`pg_detoast_datum(datum=0x) at fmgr.c:1741:6
frame #1: 0x000103439a86 
postgres`ExecEvalFieldStoreDeForm(state=, op=0x7f9212045df8, 
econtext=) at execExprInterp.c:3025:12
frame #2: 0x00010343834e postgres`ExecInterpExpr(state=, 
econtext=, isnull=0x7ffeec91fdc7) at execExprInterp.c:1337:4
frame #3: 0x00010343742b 
postgres`ExecInterpExprStillValid(state=0x7f921181db18, 
econtext=0x7f921181d670, isNull=) at execExprInterp.c:1778:9
frame #4: 0x000103444e0d 
postgres`ExecEvalExprSwitchContext(state=0x7f921181db18, 
econtext=0x7f921181d670, isNull=) at executor.h:310:13
frame #5: 0x000103444cf0 
postgres`ExecProject(projInfo=0x7f921181db10) at executor.h:344:9
frame #6: 0x000103444af6 postgres`ExecScan(node=0x7f921181d560, 
accessMtd=(postgres`SeqNext at nodeSeqscan.c:51), 
recheckMtd=(postgres`SeqRecheck at nodeSeqscan.c:90)) at execScan.c:239:12
frame #7: 0x000103461d17 postgres`ExecSeqScan(pstate=) at 
nodeSeqscan.c:112:9
frame #8: 0x00010344375c 
postgres`ExecProcNodeFirst(node=0x7f921181d560) at execProcnode.c:445:9
frame #9: 0x00010345eefe postgres`ExecProcNode(node=0x7f921181d560) 
at executor.h:242:9
frame #10: 0x00010345e74f 
postgres`ExecModifyTable(pstate=0x7f921181d090) at nodeModifyTable.c:2079:14
frame #11: 0x00010344375c 
postgres`ExecProcNodeFirst(node=0x7f921181d090) at execProcnode.c:445:9
frame #12: 0x00010343f25e 
postgres`ExecProcNode(node=0x7f921181d090) at executor.h:242:9
frame #13: 0x00010343d80d 
postgres`ExecutePlan(estate=0x7f921181cd10, planstate=0x7f921181d090, 
use_parallel_mode=, operation=CMD_UPDATE, sendTuples=false, 
numberTuples=0, direction=ForwardScanDirection, dest=0x7f9211818c38, 
execute_once=) at execMain.c:1646:10
frame #14: 0x00010343d745 
postgres`standard_ExecutorRun(queryDesc=0x7f921180e310, 
direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:364:3
frame #15: 0x00010343d67c 
postgres`ExecutorRun(queryDesc=0x7f921180e310, 
direction=ForwardScanDirection, count=0, execute_once=) at 
execMain.c:308:3
frame #16: 0x0001035784a8 postgres`ProcessQuery(plan=, 
sourceText=, params=, queryEnv=0x, 
dest=, completionTag="") at pquery.c:161:2
frame #17: 0x000103577c5e 
postgres`PortalRunMulti(portal=0x7f9215024110, isTopLevel=true, 
setHoldSnapshot=false, dest=0x7f9211818c38, altdest=0x7f9211818c38, 
completionTag="") at pquery.c:0
frame #18: 0x00010357763d postgres`PortalRun(portal=0x7f9215024110, 
count=9223372036854775807, isTopLevel=, run_once=, 
dest=0x7f9211818c38, altdest=0x7f9211818c38, completionTag="") at 
pquery.c:796:5
frame #19: 0x000103574f87 
postgres`exec_simple_query(query_string="UPDATE domain_indirection_test SET 
domain_array[0].if2 = 5;") at postgres.c:1215:10
frame #20: 0x0001035746b8 postgres`PostgresMain(argc=, 
argv=, dbname=, username=) at 
postgres.c:0
frame #21: 0x00010350d712 postgres`BackendRun(port=) at 
postmaster.c:4494:2
frame #22: 0x00010350cffa postgres`BackendStartup(port=) 
at postmaster.c:4177:3
frame #23: 0x00010350c59c postgres`ServerLoop at postmaster.c:1725:7
frame #24: 0x00010350ac8d postgres`PostmasterMain(argc=3, 
argv=0x7f9210d049c0) at postmaster.c:1398:11
frame #25: 0x00010347fbdd postgres`main(argc=, 
argv=) at main.c:228:3
frame #26: 0x7fff204e8f3d libdyld.dylib`start + 1


Thanks,
Onder KALACI
Software Engineer at Microsoft &
Developing the Citus database extension for PostgreSQL



Re: Inconsistency in startup process's MyBackendId and procsignal array registration with ProcSignalInit()

2021-10-19 Thread Kyotaro Horiguchi
(This branch should perhaps be split off from this thread..)

At Fri, 15 Oct 2021 15:00:57 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> At Thu, 14 Oct 2021 10:53:06 -0700, Andres Freund  wrote 
> in 
> > This'd get rid of the need of density *and* make SIInsertDataEntries()
> > cheaper.
> 
> Yes.  So.. I tried that. The only part where memory-flush timing is
> crucial seems to be between writing messages and setting maxMsgNum.
> By placing a memory barrier between them it seems *to me* we can read
> maxMsgNum safely without locks.

Maybe we need another memory barrier here, and the patch was broken
regarding the rechecking of the members in GetSIGetDataEntries..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Partial aggregates pushdown

2021-10-19 Thread Alexander Pyhalov

Hi.

Tomas Vondra писал 2021-10-15 17:56:

As for the proposed approach, it's probably good enough for the first
version to restrict this to aggregates where the aggregate result is
sufficient, i.e. we don't need any new export/import procedures.

But it's very unlikely we'd want to restrict it the way the patch does
it, i.e. based on aggregate name. That's both fragile (people can
create new aggregates with such a name) and against the PostgreSQL
extensibility (people may implement custom aggregates, but won't be
able to benefit from this just because of name).

So for v0 maybe, but I think there needs to be a way to relax this in
some way, for example we could add a new flag to pg_aggregate to mark
aggregates supporting this.



Updated patch to mark aggregates as pushdown-safe in pg_aggregate.

So far have no solution for aggregates with internal aggtranstype.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
From 823a389caf003a21dd4c8e758f89d08ba89c5856 Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov 
Date: Thu, 14 Oct 2021 17:30:34 +0300
Subject: [PATCH] Partial aggregates push down

---
 contrib/postgres_fdw/deparse.c|  45 +++-
 .../postgres_fdw/expected/postgres_fdw.out| 215 +-
 contrib/postgres_fdw/postgres_fdw.c   |  29 ++-
 contrib/postgres_fdw/sql/postgres_fdw.sql |  31 ++-
 src/backend/catalog/pg_aggregate.c|   4 +-
 src/backend/commands/aggregatecmds.c  |   6 +-
 src/include/catalog/catversion.h  |   2 +-
 src/include/catalog/pg_aggregate.dat  | 101 
 src/include/catalog/pg_aggregate.h|   6 +-
 9 files changed, 362 insertions(+), 77 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index d98bd666818..cf6b2d9f066 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -196,6 +196,7 @@ static bool is_subquery_var(Var *node, RelOptInfo *foreignrel,
 static void get_relation_column_alias_ids(Var *node, RelOptInfo *foreignrel,
 		  int *relno, int *colno);
 
+static bool partial_agg_ok(Aggref *agg);
 
 /*
  * Examine each qual clause in input_conds, and classify them into two groups,
@@ -831,8 +832,10 @@ foreign_expr_walker(Node *node,
 if (!IS_UPPER_REL(glob_cxt->foreignrel))
 	return false;
 
-/* Only non-split aggregates are pushable. */
-if (agg->aggsplit != AGGSPLIT_SIMPLE)
+if ((agg->aggsplit != AGGSPLIT_SIMPLE) && (agg->aggsplit != AGGSPLIT_INITIAL_SERIAL))
+	return false;
+
+if (agg->aggsplit == AGGSPLIT_INITIAL_SERIAL && !partial_agg_ok(agg))
 	return false;
 
 /* As usual, it must be shippable. */
@@ -3249,7 +3252,7 @@ deparseAggref(Aggref *node, deparse_expr_cxt *context)
 	bool		use_variadic;
 
 	/* Only basic, non-split aggregation accepted. */
-	Assert(node->aggsplit == AGGSPLIT_SIMPLE);
+	Assert(node->aggsplit == AGGSPLIT_SIMPLE || node->aggsplit == AGGSPLIT_INITIAL_SERIAL);
 
 	/* Check if need to print VARIADIC (cf. ruleutils.c) */
 	use_variadic = node->aggvariadic;
@@ -3719,3 +3722,39 @@ get_relation_column_alias_ids(Var *node, RelOptInfo *foreignrel,
 	/* Shouldn't get here */
 	elog(ERROR, "unexpected expression in subquery output");
 }
+
+/*
+ * Check that partial aggregate agg is fine to push down
+ */
+static bool
+partial_agg_ok(Aggref *agg)
+{
+	HeapTuple	aggtup;
+	Form_pg_aggregate aggform;
+
+	Assert(agg->aggsplit == AGGSPLIT_INITIAL_SERIAL);
+
+	/* We don't support complex partial aggregates */
+	if (agg->aggdistinct || agg->aggvariadic || agg->aggkind != AGGKIND_NORMAL || agg->aggorder != NIL)
+		return false;
+
+	/* Can't process aggregates which require serialization/deserialization */
+	if (agg->aggtranstype == INTERNALOID)
+		return false;
+
+	aggtup = SearchSysCache1(AGGFNOID, ObjectIdGetDatum(agg->aggfnoid));
+	if (!HeapTupleIsValid(aggtup))
+		elog(ERROR, "cache lookup failed for function %u", agg->aggfnoid);
+	aggform = (Form_pg_aggregate) GETSTRUCT(aggtup);
+
+	/* Only aggregates, marked as pushdown safe, are allowed */
+	if (aggform->aggpartialpushdownsafe != true)
+	{
+		ReleaseSysCache(aggtup);
+		return false;
+	}
+
+	ReleaseSysCache(aggtup);
+
+	return true;
+}
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 44c4367b8f9..89451e208e0 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9279,13 +9279,13 @@ RESET enable_partitionwise_join;
 -- ===
 -- test partitionwise aggregates
 -- ===
-CREATE TABLE pagg_tab (a int, b int, c text) PARTITION BY RANGE(a);
+CREATE TABLE pagg_tab (a int, b int, c text, d numeric) PARTITION BY RANGE(a);
 CREATE TABLE pagg_tab_p1 (LIKE pagg_tab);
 CREATE TABLE pagg_tab_p2 (LIKE pagg_tab);
 CREATE TABLE pagg_tab_p3 (LIKE pagg_tab);
-INSERT INTO 

Re: BUG #17220: ALTER INDEX ALTER COLUMN SET (..) with an optionless opclass makes index and table unusable

2021-10-19 Thread Matthijs van der Vleuten
On Mon, Oct 18, 2021, at 09:46, Michael Paquier wrote:
> On Thu, Oct 14, 2021 at 11:07:21AM +0900, Michael Paquier wrote:
> Attached is the patch I am finishing with, that should go down to
> v13 (this is going to conflict on REL_13_STABLE, for sure).
>
> Thoughts?

The test case doesn't seem entirely correct to me? The index being dropped 
(btree_tall_tbl_idx2) doesn't exist.

Also, I don't believe this tests the case of dropping the index when it
has previously been altered in this way.




Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2021-10-19 Thread Kyotaro Horiguchi
At Tue, 19 Oct 2021 02:45:24 +, "osumi.takami...@fujitsu.com" 
 wrote in 
> On Thursday, October 14, 2021 11:21 AM Kyotaro Horiguchi 
>  wrote:
> > It adds a call to ReorderBufferAssignChild but usually subtransactions are
> > assigned to top level elsewherae.  Addition to that
> > ReorderBufferCommitChild() called just later does the same thing.  We are
> > adding the third call to the same function, which looks a bit odd.
> It can be odd. However, we
> have a check at the top of ReorderBufferAssignChild
> to judge if the sub transaction is already associated or not
> and skip the processing if it is.

My question was why we need to make the extra call to
ReorderBufferCommitChild when XACT_XINFO_HAS_INVALS is set, in spite of
the existing call to the same function that is made unconditionally.  It
doesn't cost much, but it's also not free.

> > And I'm not sure it is wise to mark all subtransactions as "catalog changed"
> > always when the top transaction is XACT_XINFO_HAS_INVALS.
> In order to avoid this,
> can't we have a new flag (for example, in reorderbuffer struct) to check
> if we start decoding from RUNNING_XACTS, which is similar to the first patch 
> of [1]
> and use it at DecodeCommit ? This still leads to some extra specific codes 
> added
> to DecodeCommit and this solution becomes a bit similar to other previous 
> patches though.

If it is somehow wrong in any sense that we add subtransactions in
SnapBuildProcessRunningXacts (for example, we should avoid relaxing
the assertion condition), I think we would go another way.  Otherwise
we don't even need that additional flag.  (But Sawada-san's recent PoC
also needs that relaxation.)

AFAICS, and unless I'm missing something (the odds of which are
relatively high:p), we need the specially added subtransactions only
for the transactions that were running when the first RUNNING_XACTS
passed, because otherwise (substantial) subtransactions are assigned
to the toplevel by the first record of the subtransaction.

Before reaching consistency, DecodeCommit feeds the subtransactions to
ReorderBufferForget individually, so the subtransactions do not need
to be assigned to the top transaction at all.  Since the
subtransactions added by the first RUNNING_XACT are processed that
way, we don't need in the first place to call ReorderBufferCommitChild
for such subtransactions.

> [1] - 
> https://www.postgresql.org/message-id/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com
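The point made above, that a redundant ReorderBufferAssignChild call is
skipped but still not free, can be illustrated with a toy model.  This is
a hedged Python sketch of the behavior as described in this thread, not
the actual PostgreSQL C code; all names here are illustrative.

```python
# Toy model of the "assign child" idempotence discussed above: the first
# assignment of a subtransaction to its toplevel sticks, and any later
# call for the same subtransaction is detected and skipped early -- but
# each call still has to be made and checked, so it is not free.

class ToyReorderBuffer:
    def __init__(self):
        self.toplevel_by_subxid = {}  # subxid -> toplevel xid
        self.assign_calls = 0         # total calls, including redundant ones

    def assign_child(self, top_xid, sub_xid):
        self.assign_calls += 1
        if sub_xid in self.toplevel_by_subxid:
            # Already associated: skip, mirroring the early-return check
            # at the top of the real function.
            return False
        self.toplevel_by_subxid[sub_xid] = top_xid
        return True

rb = ToyReorderBuffer()
assert rb.assign_child(100, 101) is True    # first assignment sticks
assert rb.assign_child(100, 101) is False   # redundant call is a no-op
assert rb.assign_calls == 2                 # ...but it was still a call
```

The model only captures the idempotence; the real code also maintains
the transaction trees and memory accounting that make the extra calls
worth questioning.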

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?

2021-10-19 Thread Bharath Rupireddy
On Tue, Oct 19, 2021 at 6:16 AM Michael Paquier  wrote:
>
> On Mon, Oct 18, 2021 at 09:56:41AM -0400, Andrew Dunstan wrote:
> > Well, see
> > 
> >
> > Maybe we should move that section.
>
> As this is the part of the docs where we document the builds, it
> looks indeed a bit confusing to have all the requirements for the
> TAP tests there.  The section "Regression Tests" cannot be used for
> the case of VS, and the section for TAP is independent of that so we
> could use platform-dependent sub-sections.
>
> Could it be better to move all the contents of "Running the Regression
> Tests" from the Windows installation page to the section of
> "Regression Tests" instead?  That would mean spreading the knowledge
> of vcregress.pl to more than one place, though.

IMO, it is better to add a note in the "Running the Tests" section at
[1] and a link to the Windows-specific section at [2]. This will keep
all the Windows-specific things in one place without any duplication
of vcregress.pl knowledge. Thoughts? If okay, I will send a patch.

[1] https://www.postgresql.org/docs/devel/regress-run.html
[2] 
https://www.postgresql.org/docs/devel/install-windows-full.html#id-1.6.5.8.12

Regards,
Bharath Rupireddy.




Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?

2021-10-19 Thread Bharath Rupireddy
On Mon, Oct 18, 2021 at 6:19 PM Andrew Dunstan  wrote:
> > Thanks for your opinion. IIUC, the subscription tests can be run by
> > setting the environment variables PROVE_FLAGS and PROVE_TESTS and
> > using the "vcregress taptest" command, right? I failed to set the
> > environment variables appropriately and couldn't run them. Can you
> > please let me know the right way to run the tests?
>
> No extra environment flags should be required for MSVC.
>
> This should suffice:
>
> vcregress taptest src/test/subscription

Wow! This is simpler than I imagined.

> If you want to set PROVE_FLAGS the simplest thing is just to set it in
> the environment before the above invocation

Okay.
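The two-step recipe discussed above (optionally set PROVE_FLAGS in the
environment, then run "vcregress taptest <dir>") can be sketched as a
small helper.  The helper name and the flag value below are hypothetical,
for illustration only; the vcregress command line itself is the one from
this thread.

```python
# Build the environment additions and argv for a vcregress TAP test run.
# Only constructs the invocation; it does not execute anything.

def taptest_invocation(test_dir, prove_flags=None):
    """Return (extra_env, argv) for "vcregress taptest <test_dir>"."""
    extra_env = {"PROVE_FLAGS": prove_flags} if prove_flags else {}
    argv = ["vcregress", "taptest", test_dir]
    return extra_env, argv

env, argv = taptest_invocation("src/test/subscription", "--timer")
print(env, " ".join(argv))
```

With no prove_flags argument, extra_env comes back empty, matching the
"no extra environment flags should be required" point above.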

> > If any test can be run with a set of environment flags and the
> > "vcregress taptest" command, then in the first place it doesn't make
> > sense to have recoverycheck, upgradecheck and so on. Another thing is
> > that the list of "vcregress" commands covers almost all the tests
> > (core, tap, bin, isolation, contrib) except the subscription tests.
> > If we add "vcregress subscriptioncheck", the list of tests that can
> > be run with the "vcregress" command will be complete, without having
> > to depend on environment variables.
>
> The reason we have some of those other tests is that we didn't start
> with a generic taptest command in vcregress.pl.  So they are
> simply legacy code, but that is no reason to add to them.

I get it, thanks.

> > IMHO, we can have "vcregress subscriptioncheck" to make it complete
> > and easier for the user to run those tests. However, let's hear what
> > other hackers have to say about this.
>
> I really fail to see how the invocation above is in any sense
> significantly more complicated.

Yes, the command "vcregress taptest src/test/subscription" is simple enough.

> > Another thing I noticed is that we don't have any mention of the
> > "vcregress taptest" command in the docs [1], if I read the docs
> > correctly. How about adding it, along with a sample example of how to
> > run a specific TAP test with it, in the docs?
> >
> > [1] - https://www.postgresql.org/docs/current/install-windows-full.html
>
> Yes, that's probably something that should be remedied.

Yes, I will prepare a patch for it.

Regards,
Bharath Rupireddy.