Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

2023-07-04 Thread Amit Kapila
On Wed, Jul 5, 2023 at 9:01 AM Peter Smith  wrote:
>
> Hi. Here are some review comments for this patch.
>
> +1 for the patch idea.
>
> --
>
> I wasn't sure about the code comment adjustments suggested by Amit [1]:
> "Accordingly, the comments atop build_replindex_scan_key(),
> FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression()
> should also be adjusted."
>
> Actually, I thought the FindUsableIndexForReplicaIdentityFull()
> function comment is *already* describing the limitation about the
> leftmost column (see fragment below), so IIUC the Sawada-san patch is
> only trying to expose that same information in the PG docs.
>
> [FindUsableIndexForReplicaIdentityFull comment fragment]
>  * We also skip indexes if the remote relation does not contain the leftmost
>  * column of the index. This is because in most such cases sequential scan is
>  * favorable over index scan.
>

This implies that the leftmost column of the index must be a
non-expression, but I feel what the patch intends to say in the docs is
more straightforward, and it doesn't match what the proposed docs say.

> ~
>
> OTOH, it may be better if these limitation rule details were not
> scattered in the code. e.g. build_replindex_scan_key() function
> comment can be simplified:
>
> CURRENT:
>  * This is not generic routine, it expects the idxrel to be a btree, 
> non-partial
>  * and have at least one column reference (i.e. cannot consist of only
>  * expressions).
>
> SUGGESTION:
> This is not a generic routine. It expects the 'idxrel' to be an index
> deemed "usable" by the function
> FindUsableIndexForReplicaIdentityFull().
>

Note that for PK/ReplicaIdentity, we don't even call
FindUsableIndexForReplicaIdentityFull(), but build_replindex_scan_key()
would still be called for such an index. So, I am not sure your proposed
wording is an improvement.

> --
> doc/src/sgml/logical-replication.sgml
>
> 1.
> the key.  When replica identity FULL is specified,
> indexes can be used on the subscriber side for searching the rows.
> Candidate
> indexes must be btree, non-partial, and have at least one column reference
> -   (i.e. cannot consist of only expressions).  These restrictions
> -   on the non-unique index properties adhere to some of the restrictions that
> -   are enforced for primary keys.  If there are no such suitable indexes,
> +   at the leftmost column indexes (i.e. cannot consist of only
> expressions).  These
> +   restrictions on the non-unique index properties adhere to some of
> the restrictions
> +   that are enforced for primary keys.  If there are no such suitable 
> indexes,
> the search on the subscriber side can be very inefficient, therefore
> replica identity FULL should only be used as a
> fallback if no other solution is possible.  If a replica identity other
>
> Isn't this using the word "indexes" with different meanings in the
> same sentence? e.g. IIUC "leftmost column indexes" is referring to the
> ordinal number of the index fields. TBH, I am not sure the patch
> wording is even describing the limitation in quite the same way as
> what the code is actually doing.
>
> HEAD (code comment):
>  * We also skip indexes if the remote relation does not contain the leftmost
>  * column of the index. This is because in most such cases sequential scan is
>  * favorable over index scan.
>
> HEAD (rendered docs)
> Candidate indexes must be btree, non-partial, and have at least one
> column reference (i.e. cannot consist of only expressions). These
> restrictions on the non-unique index properties adhere to some of the
> restrictions that are enforced for primary keys.
>
> PATCHED (rendered docs)
> Candidate indexes must be btree, non-partial, and have at least one
> column reference at the leftmost column indexes (i.e. cannot consist
> of only expressions). These restrictions on the non-unique index
> properties adhere to some of the restrictions that are enforced for
> primary keys.
>
> MY SUGGESTION:
> Candidate indexes must be btree, non-partial, and have at least one
> column reference (i.e. cannot consist of only expressions).
> Furthermore, the leftmost field of the candidate index must be a
> column of the published table. These restrictions on the non-unique
> index properties adhere to some of the restrictions that are enforced
> for primary keys.
>

I don't know if this suggestion is what the code is actually doing. In
function RemoteRelContainsLeftMostColumnOnIdx(), we have the following
checks:
==
keycol = indexInfo->ii_IndexAttrNumbers[0];
if (!AttributeNumberIsValid(keycol))
return false;

if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol))
return false;

return attrmap->attnums[AttrNumberGetAttrOffset(keycol)] >= 0;
==

The first of these checks indicates that the leftmost column of the
index should be a non-expression, and the second and third indicate what
you suggest in your wording. We can also think that what you wrote in a
way is a superset of "leftmost index 

Re: Clean up command argument assembly

2023-07-04 Thread Peter Eisentraut

On 04.07.23 14:14, Heikki Linnakangas wrote:

On 26/06/2023 12:33, Peter Eisentraut wrote:

This is a small code cleanup patch.

Several commands internally assemble command lines to call other
commands.  This includes initdb, pg_dumpall, and pg_regress.  (Also
pg_ctl, but that is different enough that I didn't consider it here.)
This has all evolved a bit organically, with fixed-size buffers, and
various optional command-line arguments being injected with
confusing-looking code, and the spacing between options handled in
inconsistent ways.  This patch cleans all this up a bit to look clearer
and be more easily extensible with new arguments and options.


+1


committed


We start each command with printfPQExpBuffer(), and then append
arguments as necessary with appendPQExpBuffer().  Also standardize on
using initPQExpBuffer() over createPQExpBuffer() where possible.
pg_regress uses StringInfo instead of PQExpBuffer, but many of the
same ideas apply.


It's a bit bogus to use PQExpBuffer for these. If you run out of memory, 
you silently get an empty string instead. StringInfo, which exits the 
process on OOM, would be more appropriate. We have tons of such 
inappropriate uses of PQExpBuffer in all our client programs, though, so 
I don't insist on fixing this particular case right now.


Interesting point.  But as you say better dealt with as a separate problem.
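
For reference, the pattern described above looks roughly like this; an
illustrative sketch with made-up variable names, not the committed code:

#include "postgres_fe.h"
#include "pqexpbuffer.h"

/*
 * Start the command line with printfPQExpBuffer(), then append optional
 * arguments with appendPQExpBuffer()/appendPQExpBufferStr().
 */
static void
build_command(const char *prog_path, const char *datadir,
              bool verbose, const char *logfile)
{
    PQExpBufferData cmd;

    initPQExpBuffer(&cmd);
    printfPQExpBuffer(&cmd, "\"%s\" -D \"%s\"", prog_path, datadir);
    if (verbose)
        appendPQExpBufferStr(&cmd, " --verbose");
    appendPQExpBuffer(&cmd, " > \"%s\" 2>&1", logfile);

    /* ... pass cmd.data to system() or similar ... */

    termPQExpBuffer(&cmd);
}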





Re: Add more sanity checks around callers of changeDependencyFor()

2023-07-04 Thread Michael Paquier
On Tue, Jul 04, 2023 at 02:40:04PM -0400, Tom Lane wrote:
> Alvaro Herrera  writes:
>> Hmm, shouldn't we disallow moving the function to another schema, if the
>> function's schema was originally determined at extension creation time?
>> I'm not sure we really want to allow moving objects of an extension to a
>> different schema.
> 
> Why not?  I do not believe that an extension's objects are required
> to all be in the same schema.

Yes, I don't see what we would gain by putting restrictions regarding
which schema an object is located in, depending on which schema an
extension uses.
--
Michael




Re: logicalrep_message_type throws an error

2023-07-04 Thread Amit Kapila
On Mon, Jul 3, 2023 at 6:32 PM Euler Taveira  wrote:
>
> On Mon, Jul 3, 2023, at 7:30 AM, Ashutosh Bapat wrote:
>
> logicalrep_message_type() is used to convert a logical message type code
> into a name while preparing error context or details. Thus, when this
> function is called, probably an ERROR has already been raised. If
> logicalrep_message_type() gets an unknown message type, it will throw
> an error, which will suppress the error for which we are building
> context or details. That's not useful. I think instead
> logicalrep_message_type() should return "unknown" when it encounters
> an unknown message type and let the original error message be thrown
> as is.
>
>
> Hmm. Good catch. The current behavior is:
>
> ERROR:  invalid logical replication message type "X"
> LOG:  background worker "logical replication worker" (PID 71800) exited with 
> exit code 1
>
> ... that hides the details. After providing a default message type:
>
> ERROR:  invalid logical replication message type "X"
> CONTEXT:  processing remote data for replication origin "pg_16638" during 
> message type "???" in transaction 796, finished at 0/16266F8
>

I think after returning "???" from logicalrep_message_type(), the
above is possible when we get the error: "invalid logical replication
message type "X"" from apply_dispatch(), right? If so, then what about
the case when we forgot to handle some message in
logicalrep_message_type() but handled it in apply_dispatch()? Isn't it
better to return the 'action' from the function
logicalrep_message_type() for unknown type? That way the information
could be a bit better and we can easily catch the code bug as well.
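
Just to illustrate the idea (a rough sketch only, not a proposed patch; the
real function has many more cases):

const char *
logicalrep_message_type(LogicalRepMsgType action)
{
    /* static buffer so we can return a string for an unknown type too */
    static char err_unknown[20];

    switch (action)
    {
        case LOGICAL_REP_MSG_BEGIN:
            return "BEGIN";
        case LOGICAL_REP_MSG_COMMIT:
            return "COMMIT";
            /* ... all the other known message types ... */
        default:
            /* don't ERROR here; report the raw action byte instead */
            snprintf(err_unknown, sizeof(err_unknown), "??? (%d)", action);
            return err_unknown;
    }
}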

-- 
With Regards,
Amit Kapila.




pg_upgrade and cross-library upgrades

2023-07-04 Thread Michael Paquier
Hi all,

After removing --with-openssl from its build of HEAD, snapper has
begun failing in the pg_upgrade path 11->HEAD, because it attempts
pg_upgrade from binaries that have OpenSSL to builds without it:
https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=snapper=HEAD

Using the TAP tests of pg_upgrade, I can get the same failure with the
following steps:
1) Setup instance based on Postgres 11, compiled with OpenSSL.
2) Run a few tests and tap a dump:
# From 11 source tree:
make installcheck
cd contrib/pgcrypto/
USE_MODULE_DB=1 make installcheck
~/path/to/11/bin/pg_dumpall -f /tmp/olddump.sql
3) From 16~ source tree, compiled without OpenSSL:
cd src/bin/pg_upgrade
olddump=/tmp/olddump.sql oldinstall=~/path/to/11/ make check

And then you would get:
could not load library "$libdir/pgcrypto": ERROR:  could not access
file "$libdir/pgcrypto": No such file or directory
In database: contrib_regression_pgcrypto

The same thing as on HEAD could be done on snapper's back-branches by
removing --with-openssl, bringing more consistency, but pg_upgrade has never
been good at handling upgrades with different library requirements.
Something I am wondering is if AdjustUpgrade.pm could gain more
knowledge in this area, though I am unsure how much could be achieved
as this module has only object-level knowledge.

Anyway, that's not really limited to pgcrypto, any extension with
different cross-library requirements would see that.  One example,
xml2 could be compiled with libxml and without libxslt.  It is less
popular than pgcrypto, but the failure should be the same.

I'd rather choose the shortcut of removing --with-openssl from snapper
in the short term, but that does nothing for the root issue in the
long-term.

Thoughts?
--
Michael




Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

2023-07-04 Thread Peter Smith
Hi. Here are some review comments for this patch.

+1 for the patch idea.

--

I wasn't sure about the code comment adjustments suggested by Amit [1]:
"Accordingly, the comments atop build_replindex_scan_key(),
FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression()
should also be adjusted."

Actually, I thought the FindUsableIndexForReplicaIdentityFull()
function comment is *already* describing the limitation about the
leftmost column (see fragment below), so IIUC the Sawada-san patch is
only trying to expose that same information in the PG docs.

[FindUsableIndexForReplicaIdentityFull comment fragment]
 * We also skip indexes if the remote relation does not contain the leftmost
 * column of the index. This is because in most such cases sequential scan is
 * favorable over index scan.

~

OTOH, it may be better if these limitation rule details were not
scattered in the code. e.g. build_replindex_scan_key() function
comment can be simplified:

CURRENT:
 * This is not generic routine, it expects the idxrel to be a btree, non-partial
 * and have at least one column reference (i.e. cannot consist of only
 * expressions).

SUGGESTION:
This is not a generic routine. It expects the 'idxrel' to be an index
deemed "usable" by the function
FindUsableIndexForReplicaIdentityFull().

--
doc/src/sgml/logical-replication.sgml

1.
the key.  When replica identity FULL is specified,
indexes can be used on the subscriber side for searching the rows.
Candidate
indexes must be btree, non-partial, and have at least one column reference
-   (i.e. cannot consist of only expressions).  These restrictions
-   on the non-unique index properties adhere to some of the restrictions that
-   are enforced for primary keys.  If there are no such suitable indexes,
+   at the leftmost column indexes (i.e. cannot consist of only
expressions).  These
+   restrictions on the non-unique index properties adhere to some of
the restrictions
+   that are enforced for primary keys.  If there are no such suitable indexes,
the search on the subscriber side can be very inefficient, therefore
replica identity FULL should only be used as a
fallback if no other solution is possible.  If a replica identity other

Isn't this using the word "indexes" with different meanings in the
same sentence? e.g. IIUC "leftmost column indexes" is referring to the
ordinal number of the index fields. TBH, I am not sure the patch
wording is even describing the limitation in quite the same way as
what the code is actually doing.

HEAD (code comment):
 * We also skip indexes if the remote relation does not contain the leftmost
 * column of the index. This is because in most such cases sequential scan is
 * favorable over index scan.

HEAD (rendered docs)
Candidate indexes must be btree, non-partial, and have at least one
column reference (i.e. cannot consist of only expressions). These
restrictions on the non-unique index properties adhere to some of the
restrictions that are enforced for primary keys.

PATCHED (rendered docs)
Candidate indexes must be btree, non-partial, and have at least one
column reference at the leftmost column indexes (i.e. cannot consist
of only expressions). These restrictions on the non-unique index
properties adhere to some of the restrictions that are enforced for
primary keys.

MY SUGGESTION:
Candidate indexes must be btree, non-partial, and have at least one
column reference (i.e. cannot consist of only expressions).
Furthermore, the leftmost field of the candidate index must be a
column of the published table. These restrictions on the non-unique
index properties adhere to some of the restrictions that are enforced
for primary keys.

--
[1] Amit suggestions -
https://www.postgresql.org/message-id/CAA4eK1LZ_A-UmC_P%2B_hLi%2BPbwyqak%2BvRKemZ7imzk2puVTpHOA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: [PATCH] Add loongarch native checksum implementation.

2023-07-04 Thread YANG Xudong

Is there any other comment?

If the patch looks OK, I would like to update its status to ready for 
committer in the commitfest.


Thanks!

On 2023/6/16 09:28, YANG Xudong wrote:

Updated the patch based on the comments.

On 2023/6/15 18:30, John Naylor wrote:


On Wed, Jun 14, 2023 at 9:20 AM YANG Xudong wrote:

 >
 > Attached a new patch with fixes based on the comment below.

Note: It's helpful to pass "-v" to git format-patch, to have different 
versions.




Added v2

 > > For x86 and Arm, if it fails to link without an -march flag, we allow
 > > for a runtime check. The flags "-march=armv8-a+crc" and "-msse4.2" are
 > > for instructions not found on all platforms. The patch also checks both
 > > ways, and each one results in "Use LoongArch CRC instruction
 > > unconditionally". The -march flag here is general, not specific. In
 > > other words, if this only runs inside "+elif host_cpu == 'loongarch64'",
 > > why do we need both with -march and without?
 > >
 >
 > Removed the elif branch.

Okay, since we've confirmed that no arch flag is necessary, some other 
places can be simplified:


--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -98,6 +98,11 @@ pg_crc32c_armv8.o: CFLAGS+=$(CFLAGS_CRC)
  pg_crc32c_armv8_shlib.o: CFLAGS+=$(CFLAGS_CRC)
  pg_crc32c_armv8_srv.o: CFLAGS+=$(CFLAGS_CRC)

+# all versions of pg_crc32c_loongarch.o need CFLAGS_CRC
+pg_crc32c_loongarch.o: CFLAGS+=$(CFLAGS_CRC)
+pg_crc32c_loongarch_shlib.o: CFLAGS+=$(CFLAGS_CRC)
+pg_crc32c_loongarch_srv.o: CFLAGS+=$(CFLAGS_CRC)

This was copy-and-pasted from platforms that use a runtime check, so 
should be unnecessary.




Removed these lines.

+# If the intrinsics are supported, sets pgac_loongarch_crc32c_intrinsics,
+# and CFLAGS_CRC.

+# Check if __builtin_loongarch_crcc_* intrinsics can be used
+# with the default compiler flags.
+# CFLAGS_CRC is set if the extra flag is required.

Same here -- it seems we don't need to set CFLAGS_CRC at all. Can you 
confirm?




We don't need to set CFLAGS_CRC as commented. I have updated the 
configure script to make it align with the logic in meson build script.


 > > Also, I don't have a Loongarch machine for testing. Could you show that
 > > the instructions are found in the binary, maybe using objdump and grep?
 > > Or a performance test?
 > >
 >
 > The output of the objdump command `objdump -dS
 > ../postgres-build/tmp_install/usr/local/pgsql/bin/postgres | grep -B 30 -A 10 crcc`
 > is attached.

Thanks for confirming.

--
John Naylor
EDB: http://www.enterprisedb.com 





Re: Initial Schema Sync for Logical Replication

2023-07-04 Thread Masahiko Sawada
On Mon, Jun 19, 2023 at 5:29 PM Peter Smith  wrote:
>
> Hi,
>
> Below are my review comments for the PoC patch 0001.
>
> In addition,  the patch needed rebasing, and, after I rebased it
> locally in my private environment there were still test failures:
> a) The 'make check' tests fail but only in a minor way due to a changed colname
> b) the subscription TAP test did not work at all for me -- many errors.

Thank you for reviewing the patch.

While updating the patch, I realized that the current approach won't
work well, or at least has a problem, with partitioned tables. If a
publication has a partitioned table with publish_via_root = false, the
subscriber launches tablesync workers for its partitions so that each
tablesync worker copies the data of one partition. Similarly, if it has a
partitioned table with publish_via_root = true, the subscriber launches
a tablesync worker for the parent table. With the current design,
since the tablesync worker is responsible for both schema and data
synchronization for the target table, it won't be possible to
synchronize both the parent table's schema and partitions' schema. For
example, there is no pg_subscription_rel entry for the parent table if
the publication has publish_via_root = false. In addition to that, we
need to be careful about the order of synchronization of the parent
table and its partitions. We cannot start schema synchronization for
partitions before its parent table. So it seems to me that we need to
consider another approach.
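
For reference, the two cases above correspond to roughly the following
(assuming publish_via_partition_root is the option abbreviated as
publish_via_root above; the table name is made up):

-- Partitions are published individually; the subscriber creates a
-- pg_subscription_rel entry and a tablesync worker per leaf partition.
CREATE PUBLICATION pub_leaves FOR TABLE parted
    WITH (publish_via_partition_root = false);

-- Changes are published as if they came from the parent; the subscriber
-- uses a single tablesync worker for the parent table only.
CREATE PUBLICATION pub_root FOR TABLE parted
    WITH (publish_via_partition_root = true);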

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Autogenerate some wait events code and documentation

2023-07-04 Thread Michael Paquier
On Tue, Jul 04, 2023 at 09:34:33AM +0200, Drouvot, Bertrand wrote:
> Yeah, with "capture" set to "false" then ninja alldocs does not error out
> and wait_event_types.sgml gets generated.
> 
> I'll look at the extra options --code and --docs.

+wait_event_types.sgml: $(top_srcdir)/src/backend/utils/activity/wait_event_names.txt $(top_srcdir)/src/backend/utils/activity/generate-wait_event_types.pl
+   $(PERL) $(top_srcdir)/src/backend/utils/activity/generate-wait_event_types.pl --docs $< > $@

This is doing the same error as meson in v10, there is no need for
the last part doing the redirection because the script outputs
nothing.  Here is the command generated:
make -C doc/src/sgml/ wait_event_types.sgml
'/usr/bin/perl'
../../../src/backend/utils/activity/generate-wait_event_types.pl
--docs ../../../src/backend/utils/activity/wait_event_names.txt >
wait_event_types.sgml
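
In other words, dropping the redirection should be enough, i.e. something
like this (a sketch; the script writes the file itself):

wait_event_types.sgml: $(top_srcdir)/src/backend/utils/activity/wait_event_names.txt $(top_srcdir)/src/backend/utils/activity/generate-wait_event_types.pl
	$(PERL) $(top_srcdir)/src/backend/utils/activity/generate-wait_event_types.pl --docs $<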

+wait_event_names = custom_target('wait_event_names',
+  input: files('../../backend/utils/activity/wait_event_names.txt'),
+  output: ['wait_event_types.h'],
This one was not completely correct (look at fmgrtab, for example), as
it is missing pgstat_wait_event.c in the output generated.  We could
perhaps be more selective with all that, including fmgrtab, but I have
left that out for now.  Note also the tweak with install_dir to not
install the C file.

+wait_event_names = custom_target('wait_event_names',
+  input: files('./wait_event_names.txt'),
+  output: ['pgstat_wait_event.c'],
+  command: [
+perl, files('./generate-wait_event_types.pl'),
+'-o', '@OUTDIR@', '--code',
+'@INPUT@'
+  ],
+  install: true,
+  install_dir: [false],
+)
[...]
+# these include .c files generated in ../../../include/activity, seems nicer to not
+# add that as an include path for the whole backend
+waitevent_sources = files(
   'wait_event.c',
 )
+
+backend_link_with += static_library('wait_event_names',
+  waitevent_sources,
+  dependencies: [backend_code],
+  include_directories: include_directories('../../../include/utils'),
+  kwargs: internal_lib_args,
+)

"wait_event_names" with the extra command should not be necessary
here, because we feed from the C file generated in src/include/utils/,
included in wait_event.c.  See src/backend/nodes/meson.build for a
similar example.

Two of the error messages after rename() in the script were
inconsistent.  So reworded these on the way.

I have added a usage() to the script, while on it.

The VPATH build was broken, because the following line was missing
from src/backend/utils/activity/Makefile to be able to detect
pgstat_wait_event.c from wait_event.c:
override CPPFLAGS := -I. -I$(srcdir) $(CPPFLAGS)

With all that in place, VPATH builds, the CI, meson, configure/make
and the various cleanup targets were working fine, so I have applied
it.  Now let's see what the buildfarm tells.

The final --stat number is like that:
 22 files changed, 757 insertions(+), 2111 deletions(-)
--
Michael




Make --help output fit within 80 columns per line

2023-07-04 Thread torikoshia

Hi,

As discussed in [1], the --help output of some commands fits into 80 columns
per line, while that of others does not.

Since it seems preferable to have a consistent line break policy, and some
people use 80-column terminals, wouldn't it be better to make all commands
fit within 80 columns per line?

Attached is a patch which does this for the src/bin commands.

If this is the way to go, I'll do the same for the contrib commands.

[1] 
https://www.postgresql.org/message-id/3fe4af5a0a81fc6a2ec01cb484c0a487%40oss.nttdata.com



--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

From 4505bbcf0efe199e34159696d64bd9f8cc60fb37 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Wed, 5 Jul 2023 10:14:58 +0900
Subject: [PATCH v1] Make --help output fit within 80 columns per line

Outputs of --help for some commands fits into 80 columns per line,
while others do not.
For the consistency and for 80-column terminal, this patch makes them
fit within 80 columns per line.

---
 src/bin/initdb/initdb.c   |  6 --
 src/bin/pg_amcheck/pg_amcheck.c   | 21 ---
 src/bin/pg_archivecleanup/pg_archivecleanup.c |  2 +-
 src/bin/pg_basebackup/pg_receivewal.c | 12 +++
 src/bin/pg_basebackup/pg_recvlogical.c| 18 ++--
 src/bin/pg_checksums/pg_checksums.c   |  3 ++-
 src/bin/pg_dump/pg_dump.c |  3 ++-
 src/bin/pg_dump/pg_dumpall.c  |  3 ++-
 src/bin/pg_dump/pg_restore.c  |  6 --
 src/bin/pg_upgrade/option.c   |  9 +---
 src/bin/pgbench/pgbench.c |  6 --
 src/bin/psql/help.c   |  9 +---
 src/bin/scripts/createdb.c|  3 ++-
 src/bin/scripts/pg_isready.c  |  3 ++-
 src/bin/scripts/vacuumdb.c| 21 ---
 15 files changed, 83 insertions(+), 42 deletions(-)

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index fc1fb363e7..ccdd093e24 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -2429,8 +2429,10 @@ usage(const char *progname)
 	printf(_("  %s [OPTION]... [DATADIR]\n"), progname);
 	printf(_("\nOptions:\n"));
 	printf(_("  -A, --auth=METHOD default authentication method for local connections\n"));
-	printf(_("  --auth-host=METHODdefault authentication method for local TCP/IP connections\n"));
-	printf(_("  --auth-local=METHOD   default authentication method for local-socket connections\n"));
+	printf(_("  --auth-host=METHODdefault authentication method for local TCP/IP\n"
+			 "connections\n"));
+	printf(_("  --auth-local=METHOD   default authentication method for local-socket\n"
+			 "connections\n"));
 	printf(_(" [-D, --pgdata=]DATADIR location for this database cluster\n"));
 	printf(_("  -E, --encoding=ENCODING   set default encoding for new databases\n"));
 	printf(_("  -g, --allow-group-access  allow group read/execute on data directory\n"));
diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 68f8180c19..754a2723ce 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -1150,17 +1150,23 @@ help(const char *progname)
 	printf(_("  -S, --exclude-schema=PATTERNdo NOT check matching schema(s)\n"));
 	printf(_("  -t, --table=PATTERN check matching table(s)\n"));
 	printf(_("  -T, --exclude-table=PATTERN do NOT check matching table(s)\n"));
-	printf(_("  --no-dependent-indexes  do NOT expand list of relations to include indexes\n"));
-	printf(_("  --no-dependent-toastdo NOT expand list of relations to include TOAST tables\n"));
+	printf(_("  --no-dependent-indexes  do NOT expand list of relations to include\n"
+			 "  indexes\n"));
+	printf(_("  --no-dependent-toastdo NOT expand list of relations to include\n"
+			 "  TOAST tables\n"));
 	printf(_("  --no-strict-names   do NOT require patterns to match objects\n"));
 	printf(_("\nTable checking options:\n"));
 	printf(_("  --exclude-toast-pointersdo NOT follow relation TOAST pointers\n"));
 	printf(_("  --on-error-stop stop checking at end of first corrupt page\n"));
-	printf(_("  --skip=OPTION   do NOT check \"all-frozen\" or \"all-visible\" blocks\n"));
-	printf(_("  --startblock=BLOCK  begin checking table(s) at the given block number\n"));
-	printf(_("  --endblock=BLOCKcheck table(s) only up to the given block number\n"));
+	printf(_("  --skip=OPTION   do NOT check \"all-frozen\" or \"all-visible\"\n"
+			 "  blocks\n"));
+	printf(_("  --startblock=BLOCK  begin checking table(s) at the given block\n"
+			 "  number\n"));
+	

Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-07-04 Thread jian he
On Sat, Jul 1, 2023 at 6:09 AM Thomas Munro  wrote:
>
>
> It should be restricted by role, but I wonder which role it should be.
> Testing for superuser is now out of fashion.
>

as pg_buffercache/pg_buffercache--1.2--1.3.sql. You need pg_maintain
privilege to use pg_buffercache.
The following query works on a single user. Obviously you need a role who
can gain pg_monitor privilege.

begin;
create role test login nosuperuser;
grant select, insert on onek to test;
grant pg_monitor to test;
set role test;
select count(*) from onek;
insert into onek values(default);
(SELECT count(*) FROM pg_buffercache WHERE relfilenode =
pg_relation_filenode('onek'::regclass))
except
(
select count(pg_buffercache_invalidate(bufferid))
from pg_buffercache where relfilenode =
pg_relation_filenode('onek'::regclass)
);

rollback;


Re: brininsert optimization opportunity

2023-07-04 Thread Soumyadeep Chakraborty
On Tue, Jul 4, 2023 at 2:54 PM Tomas Vondra
 wrote:

> I'm not sure if memory context callbacks are the right way to rely on
> for this purpose. The primary purpose of memory contexts is to track
> memory, so using them for this seems a bit weird.

Yeah, this just kept getting dirtier and dirtier.

> There are cases that do something similar, like expandendrecord.c where
> we track refcounted tuple slot, but IMHO there's a big difference
> between tracking one slot allocated right there, and unknown number of
> buffers allocated much later.

Yeah, the following code in ER_mc_callback is there, I think, to prevent
double-freeing the tupdesc (since it might be freed in
ResourceOwnerReleaseInternal()) (the part about: /* Ditto for tupdesc
references */).

if (tupdesc->tdrefcount > 0)
{
if (--tupdesc->tdrefcount == 0)
FreeTupleDesc(tupdesc);
}
Plus the above code doesn't try anything with Resource owner stuff, whereas
releasing a buffer means:
ReleaseBuffer() -> UnpinBuffer() ->
ResourceOwnerForgetBuffer(CurrentResourceOwner, b);

> The fact that even with the extra context is still doesn't handle query
> cancellations is another argument against that approach (I wonder how
> expandedrecord.c handles that, but I haven't checked).
>
> >
> > Maybe there is a better way of doing our cleanup? I'm not sure. Would
> > love your input!
> >
> > The other alternative for all this is to introduce new AM callbacks for
> > insert_begin and insert_end. That might be a tougher sell?
> >
>
> That's the approach I wanted to suggest, more or less - to do the
> cleanup from ExecCloseIndices() before index_close(). I wonder if it's
> even correct to do that later, once we release the locks etc.

I'll try this out and introduce a couple of new index AM callbacks. I
think it's best to do it before releasing the locks - otherwise it might
be weird to manipulate buffers of an index relation without having some
sort of lock on it. I'll think about it some more.
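
Roughly, what I have in mind is something along these lines (callback name,
signature and placement are all hypothetical here, not an agreed API):

/* Hypothetical new AM callback; the name is made up for illustration. */
typedef void (*aminsertcleanup_function) (IndexInfo *indexInfo);

/*
 * Sketch of ExecCloseIndices() giving the AM a chance to release buffers
 * (or whatever else it cached in ii_AmCache) while we still hold the lock,
 * before index_close().
 */
void
ExecCloseIndices(ResultRelInfo *resultRelInfo)
{
    int         i;

    for (i = 0; i < resultRelInfo->ri_NumIndices; i++)
    {
        Relation    indexRelation = resultRelInfo->ri_IndexRelationDescs[i];
        IndexInfo  *indexInfo = resultRelInfo->ri_IndexRelationInfo[i];

        if (indexRelation == NULL)
            continue;

        /* hypothetical per-AM cleanup hook */
        if (indexRelation->rd_indam->aminsertcleanup)
            indexRelation->rd_indam->aminsertcleanup(indexInfo);

        index_close(indexRelation, RowExclusiveLock);
    }
}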

> I don't think ii_AmCache was intended for stuff like this - GIN and GiST
> only use it to cache stuff that can be just pfree-d, but for buffers
> that's not enough. It's not surprising we need to improve this.

Hmmm, yes, although the docs state:
"If the index AM wishes to cache data across successive index insertions within
an SQL statement, it can allocate space in indexInfo->ii_Context and
store a pointer
to the data in indexInfo->ii_AmCache (which will be NULL initially)."
they don't mention anything about buffer usage. Well we will fix it!

PS: It should be possible to make GIN and GiST use the new index AM APIs
as well.

> FWIW while debugging this (breakpoint on MemoryContextDelete) I was
> rather annoyed the COPY keeps dropping and recreating the two BRIN
> contexts - brininsert cxt / brin dtuple. I wonder if we could keep and
> reuse those too, but I don't know how much it'd help.
>

Interesting, I will investigate that.

> > Now, to finally answer your question about the speedup without
> > generate_series(). We do see an even higher speedup!
> >
> > seq 1 2 > /tmp/data.csv
> > \timing
> > DROP TABLE heap;
> > CREATE TABLE heap(i int);
> > CREATE INDEX ON heap USING brin(i) WITH (pages_per_range=1);
> > COPY heap FROM '/tmp/data.csv';
> >
> > -- 3 runs (master 29cf61ade3f245aa40f427a1d6345287ef77e622)
> > COPY 2
> > Time: 205072.444 ms (03:25.072)
> > Time: 215380.369 ms (03:35.380)
> > Time: 203492.347 ms (03:23.492)
> >
> > -- 3 runs (branch v2)
> >
> > COPY 2
> > Time: 135052.752 ms (02:15.053)
> > Time: 135093.131 ms (02:15.093)
> > Time: 138737.048 ms (02:18.737)
> >
>
> That's nice, but it still doesn't say how much of that is reading the
> data. If you do just copy into a table without any indexes, how long
> does it take?

So, I loaded the same heap table without any indexes and at the same
scale. I got:

postgres=# COPY heap FROM '/tmp/data.csv';
COPY 2
Time: 116161.545 ms (01:56.162)
Time: 114182.745 ms (01:54.183)
Time: 114975.368 ms (01:54.975)

perf diff also attached between the three: w/ no indexes (baseline),
master and v2.

Regards,
Soumyadeep (VMware)


perf_diff_3way.out
Description: Binary data


[PATCH] Add GitLab CI to PostgreSQL

2023-07-04 Thread Newhouse, Robin
Hello!

I propose the attached patch to be applied on the 'master' branch
of PostgreSQL to add GitLab CI automation alongside Cirrus CI in the PostgreSQL 
repository.

It is not intended to be a replacement for Cirrus CI, but simply a suggestion
for the PostgreSQL project to centrally host a GitLab CI definition for those
who prefer to use it while developing/testing PostgreSQL.

The intent is to facilitate collaboration among GitLab users, promote
standardization and consistency, and ultimately, improve testing and code quality.

Robin Newhouse
Amazon Web Services: https://aws.amazon.com


v1-0001-Add-GitLab-CI-to-PostgreSQL.patch
Description: v1-0001-Add-GitLab-CI-to-PostgreSQL.patch


Re: Deleting prepared statements from libpq.

2023-07-04 Thread Michael Paquier
On Tue, Jul 04, 2023 at 04:09:43PM +0900, Michael Paquier wrote:
> On Tue, Jul 04, 2023 at 08:28:40AM +0900, Michael Paquier wrote:
>> Sure, feel free.  I was planning to look at and play more with it.
>
> Well, done.

For the sake of completeness, as I forgot to send my notes.

+   if (PQsendClosePrepared(conn, "select_one") != 1)
+   pg_fatal("PQsendClosePortal failed: %s", PQerrorMessage(conn));
There was a small copy-pasto here.
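
Presumably the message was meant to name the function actually being called,
i.e. something like:

+   if (PQsendClosePrepared(conn, "select_one") != 1)
+       pg_fatal("PQsendClosePrepared failed: %s", PQerrorMessage(conn));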
--
Michael




Re: Experiments with Postgres and SSL

2023-07-04 Thread Michael Paquier
On Tue, Jul 04, 2023 at 05:15:49PM +0300, Heikki Linnakangas wrote:
> I don't see the point of the libpq 'sslalpn' option either. Let's send ALPN
> always.
> 
> Admittedly having the options make testing different of combinations of old
> and new clients and servers a little easier. But I don't think we should add
> options for the sake of backwards compatibility tests.

Hmm.  I would actually argue in favor of having these with tests in
core to stress the previous SSL handshake protocol, as not having these
parameters would mean that we rely only on major version upgrades in
the buildfarm to test the backward-compatible code path, making issues
much harder to catch.  And we still need to maintain the
backward-compatible path for 10 years based on what pg_dump and
pg_upgrade need to support.
--
Michael




Re: check_strxfrm_bug()

2023-07-04 Thread Thomas Munro
On Wed, Jul 5, 2023 at 10:15 AM Thomas Munro  wrote:
> [1] https://cirrus-ci.com/build/5298278007308288

That'll teach me to be impatient.  I only waited for compiling to
finish after triggering the optional extra MinGW task before sending
the above email, figuring that the only risk was there, but then the
pg_upgrade task failed due to mismatched locales.  Apparently there is
something I don't understand yet about locale_t support under MinGW.




Re: check_strxfrm_bug()

2023-07-04 Thread Thomas Munro
On Tue, Jul 4, 2023 at 2:52 AM Tristan Partin  wrote:
> The patch looks good to me as well. Happy to rebase my other patch on
> this one.

Thanks.  Here is a slightly tidier version.  It passes on CI[1]
including the optional extra MinGW64/Meson task, and the
MinGW64/autoconf configure+build that is in the SanityCheck task.
There are two questions I'm hoping to get feedback on:  (1) I believe
that defining HAVE_MBSTOWCS_L etc in win32_port.h is the best idea
because that is also where we define mbstowcs_l() etc.  Does that make
sense?  (2) IIRC, ye olde Solution.pm system might break if I were to
completely remove HAVE_MBSTOWCS_L and HAVE_WCSTOMBS_L from Solution.pm
(there must be a check somewhere that compares it with pg_config.h.in
or something like that), but it would also break if I defined them as
1 there (macro redefinition).  Will undef in Solution.pm be
acceptable (ie define nothing to avoid redefinition, but side-step the
sanity check)?  It's a bit of a kludge, but IIRC we're dropping that
3rd build system in 17 so maybe that's OK?  (Not tested as I don't
have Windows and CI doesn't test Solution.pm, so I'd be grateful if
someone who has Windows/Solution.pm setup could try this.)
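
For example, something like this in the Solution.pm define table (a sketch
only):

    HAVE_MBSTOWCS_L => undef,
    HAVE_WCSTOMBS_L => undef,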

[1] https://cirrus-ci.com/build/5298278007308288
From ab0705a9e169627a77bf4e3cc36ea53603162921 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 4 Jul 2023 16:32:35 +1200
Subject: [PATCH v2] All supported systems have locale_t.

locale_t is defined by POSIX.1-2008 and SUSv4, and available on all
targeted systems.  For Windows, win32_port.h redirects to a partial
implementation called _locale_t.  It's time to remove a lot of
compile-time tests for HAVE_LOCALE_T, and a few associated comments and
dead code branches.

Definitions for HAVE_WCSTOMBS_L and HAVE_MBSTOWCS_L had to be moved into
win32_port.h, because this change revealed that MinGW builds on Windows
were failing to detect these functions, but without them we couldn't
build due to the assumption that uselocale() exists.

Reviewed-by: Noah Misch 
Reviewed-by: Tristan Partin 
Discussion: https://postgr.es/m/CA%2BhUKGLg7_T2GKwZFAkEf0V7vbnur-NfCjZPKZb%3DZfAXSV1ORw%40mail.gmail.com
---
 config/c-library.m4  | 10 ++
 configure|  5 ---
 meson.build  | 22 +++-
 src/backend/commands/collationcmds.c |  2 +-
 src/backend/regex/regc_pg_locale.c   | 52 +---
 src/backend/utils/adt/formatting.c   | 18 --
 src/backend/utils/adt/like.c |  2 --
 src/backend/utils/adt/like_support.c |  2 --
 src/backend/utils/adt/pg_locale.c| 36 ---
 src/include/pg_config.h.in   |  3 --
 src/include/port/win32_port.h|  3 ++
 src/include/utils/pg_locale.h|  7 +---
 src/tools/msvc/Solution.pm   |  5 ++-
 13 files changed, 16 insertions(+), 151 deletions(-)

diff --git a/config/c-library.m4 b/config/c-library.m4
index c1dd804679..aa8223d2ef 100644
--- a/config/c-library.m4
+++ b/config/c-library.m4
@@ -86,9 +86,9 @@ AC_DEFUN([PGAC_STRUCT_SOCKADDR_SA_LEN],
 # PGAC_TYPE_LOCALE_T
 # --
 # Check for the locale_t type and find the right header file.  macOS
-# needs xlocale.h; standard is locale.h, but glibc also has an
-# xlocale.h file that we should not use.
-#
+# needs xlocale.h; standard is locale.h, but glibc <= 2.25 also had an
+# xlocale.h file that we should not use, so we check the standard
+# header first.
 AC_DEFUN([PGAC_TYPE_LOCALE_T],
 [AC_CACHE_CHECK([for locale_t], pgac_cv_type_locale_t,
 [AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
@@ -102,10 +102,6 @@ locale_t x;],
 [])],
 [pgac_cv_type_locale_t='yes (in xlocale.h)'],
 [pgac_cv_type_locale_t=no])])])
-if test "$pgac_cv_type_locale_t" != no; then
-  AC_DEFINE(HAVE_LOCALE_T, 1,
-[Define to 1 if the system has the type `locale_t'.])
-fi
 if test "$pgac_cv_type_locale_t" = 'yes (in xlocale.h)'; then
   AC_DEFINE(LOCALE_T_IN_XLOCALE, 1,
 [Define to 1 if `locale_t' requires .])
diff --git a/configure b/configure
index 997d42d8f7..f1d4630007 100755
--- a/configure
+++ b/configure
@@ -15120,11 +15120,6 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
 fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_type_locale_t" >&5
 $as_echo "$pgac_cv_type_locale_t" >&6; }
-if test "$pgac_cv_type_locale_t" != no; then
-
-$as_echo "#define HAVE_LOCALE_T 1" >>confdefs.h
-
-fi
 if test "$pgac_cv_type_locale_t" = 'yes (in xlocale.h)'; then
 
 $as_echo "#define LOCALE_T_IN_XLOCALE 1" >>confdefs.h
diff --git a/meson.build b/meson.build
index 3ea4b0d72a..82a966fcc2 100644
--- a/meson.build
+++ b/meson.build
@@ -2283,17 +2283,12 @@ else
   cdata.set('STRERROR_R_INT', false)
 endif
 
-# Check for the locale_t type and find the right header file.  macOS
-# needs xlocale.h; standard is locale.h, but glibc also has an
-# xlocale.h file that we should not use.  MSVC has a replacement
-# defined in src/include/port/win32_port.h.
-if 

Re: Parallel CREATE INDEX for BRIN indexes

2023-07-04 Thread Tomas Vondra



On 7/4/23 23:53, Matthias van de Meent wrote:
> On Thu, 8 Jun 2023 at 14:55, Tomas Vondra  
> wrote:
>>
>> Hi,
>>
>> Here's a WIP patch allowing parallel CREATE INDEX for BRIN indexes. The
>> infrastructure (starting workers etc.) is "inspired" by the BTREE code
>> (i.e. copied from that and massaged a bit to call brin stuff).
> 
> Nice work.
> 
>> In both cases _brin_end_parallel then reads the summaries from worker
>> files, and adds them into the index. In 0001 this is fairly simple,
>> although we could do one more improvement and sort the ranges by range
>> start to make the index nicer (and possibly a bit more efficient). This
>> should be simple, because the per-worker results are already sorted like
>> that (so a merge sort in _brin_end_parallel would be enough).
> 
> I see that you manually built the passing and sorting of tuples
> between workers, but can't we use the parallel tuplesort
> infrastructure for that? It already has similar features in place and
> improves code commonality.
> 

Maybe. I wasn't that familiar with what parallel tuplesort can and can't
do, and the little I knew I managed to forget since I wrote this patch.
Which similar features do you have in mind?

The workers are producing the results in "start_block" order, so if they
pass that to the leader, it probably can do the usual merge sort.

>> For 0002 it's a bit more complicated, because with a single parallel
>> scan brinbuildCallbackParallel can't decide if a range is assigned to a
>> different worker or empty. And we want to generate summaries for empty
>> ranges in the index. We could either skip such range during index build,
>> and then add empty summaries in _brin_end_parallel (if needed), or add
>> them and then merge them using "union".
>>
>>
>> I just realized there's a third option to do this - we could just do
>> regular parallel scan (with no particular regard to pagesPerRange), and
>> then do "union" when merging results from workers. It doesn't require
>> the sequence of TID scans, and the union would also handle the empty
>> ranges. The per-worker results might be much larger, though, because
>> each worker might produce up to the "full" BRIN index.
> 
> Would it be too much effort to add a 'min_chunk_size' argument to
> table_beginscan_parallel (or ParallelTableScanDesc) that defines the
> minimum granularity of block ranges to be assigned to each process? I
> think that would be the most elegant solution that would require
> relatively little effort: table_block_parallelscan_nextpage already
> does parallel management of multiple chunk sizes, and I think this
> modification would fit quite well in that code.
> 

I'm confused. Isn't that pretty much exactly what 0002 does? I mean,
that passes pagesPerRange to table_parallelscan_initialize(), so that
each pagesPerRange is assigned to a single worker.

The trouble I described above is that the scan returns tuples, and the
consumer has no idea what was the chunk size or how many other workers
are there. Imagine you get a tuple from block 1, and then a tuple from
block 1000. Does that mean that the blocks in between are empty or that
they were processed by some other worker?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: brininsert optimization opportunity

2023-07-04 Thread Tomas Vondra
On 7/4/23 21:25, Soumyadeep Chakraborty wrote:
> Thank you both for reviewing!
> 
> On Tue, Jul 4, 2023 at 4:24AM Alvaro Herrera  wrote:
> 
>> Hmm, yeah, I remember being bit bothered by this repeated
>> initialization. Your patch looks reasonable to me. I would set
>> bistate->bs_rmAccess to NULL in the cleanup callback, just to be sure.
>> Also, please add comments atop these two new functions, to explain what
>> they are.
> 
> Done. Set bistate->bs_desc = NULL; as well. Added comments.
> 
> 
> On Tue, Jul 4, 2023 at 4:59AM Tomas Vondra
>  wrote:
> 
>> Yeah. I wonder how much of that runtime is the generate_series(),
>> though. What's the speedup if that part is subtracted. It's guaranteed
>> to be even more significant, but by how much?
> 
> When trying COPY, I got tripped by the following:
> 
> We get a buffer leak WARNING for the meta page and a revmap page.
> 
> WARNING:  buffer refcount leak: [094] (rel=base/156912/206068,
> blockNum=1, flags=0x8300, refcount=1 1)
> WARNING:  buffer refcount leak: [093] (rel=base/156912/206068,
> blockNum=0, flags=0x8300, refcount=1 1)
> 
> PrintBufferLeakWarning bufmgr.c:3240
> ResourceOwnerReleaseInternal resowner.c:554
> ResourceOwnerRelease resowner.c:494
> PortalDrop portalmem.c:563
> exec_simple_query postgres.c:1284
> 
> We release the buffer during this resowner release and then we crash
> with:
> 
> TRAP: failed Assert("bufnum <= NBuffers"), File:
> "../../../../src/include/storage/bufmgr.h", Line: 305, PID: 86833
> postgres: pivotal test4 [local] 
> COPY(ExceptionalCondition+0xbb)[0x5572b55bcc79]
> postgres: pivotal test4 [local] COPY(+0x61ccfc)[0x5572b537dcfc]
> postgres: pivotal test4 [local] COPY(ReleaseBuffer+0x19)[0x5572b5384db2]
> postgres: pivotal test4 [local] COPY(brinRevmapTerminate+0x1e)[0x5572b4e3fd39]
> postgres: pivotal test4 [local] COPY(+0xcfc44)[0x5572b4e30c44]
> postgres: pivotal test4 [local] COPY(+0x89e7f2)[0x5572b55ff7f2]
> postgres: pivotal test4 [local] COPY(MemoryContextDelete+0xd7)[0x5572b55ff683]
> postgres: pivotal test4 [local] COPY(PortalDrop+0x374)[0x5572b5602dc7]
> 
> Unfortunately, when we do COPY, the MemoryContext where makeIndexInfo
> gets called is PortalContext and that is what is set in ii_Context.
> Furthermore, we clean up the resource owner stuff before we can clean
> up the MemoryContexts in PortalDrop().
> 
> The CurrentMemoryContext when initialize_brin_insertstate() is called
> depends. For CopyMultiInsertBufferFlush() -> ExecInsertIndexTuples()
> it is PortalContext, and for CopyFrom() -> ExecInsertIndexTuples() it is
> ExecutorState/ExprContext. We can't rely on it to register the callback
> either.
>
> What we can do is create a new MemoryContext for holding the
> BrinInsertState, and we tie the callback to that so that cleanup is not
> affected by all of these variables. See v2 patch attached. Passes make
> installcheck-world and make installcheck -C src/test/modules/brin.
>
> However, we do still have 1 issue with the v2 patch:
> When we try to cancel (Ctrl-c) a running COPY command:
> ERROR:  buffer 151 is not owned by resource owner TopTransaction
> 

I'm not sure if memory context callbacks are the right way to rely on
for this purpose. The primary purpose of memory contexts is to track
memory, so using them for this seems a bit weird.

There are cases that do something similar, like expandendrecord.c where
we track refcounted tuple slot, but IMHO there's a big difference
between tracking one slot allocated right there, and unknown number of
buffers allocated much later.

The fact that even with the extra context is still doesn't handle query
cancellations is another argument against that approach (I wonder how
expandedrecord.c handles that, but I haven't checked).

> 
> Maybe there is a better way of doing our cleanup? I'm not sure. Would
> love your input!
> 
> The other alternative for all this is to introduce new AM callbacks for
> insert_begin and insert_end. That might be a tougher sell?
> 

That's the approach I wanted to suggest, more or less - to do the
cleanup from ExecCloseIndices() before index_close(). I wonder if it's
even correct to do that later, once we release the locks etc.

I don't think ii_AmCache was intended for stuff like this - GIN and GiST
only use it to cache stuff that can be just pfree-d, but for buffers
that's not enough. It's not surprising we need to improve this.

FWIW while debugging this (breakpoint on MemoryContextDelete) I was
rather annoyed the COPY keeps dropping and recreating the two BRIN
contexts - brininsert cxt / brin dtuple. I wonder if we could keep and
reuse those too, but I don't know how much it'd help.

> Now, to finally answer your question about the speedup without
> generate_series(). We do see an even higher speedup!
> 
> seq 1 2 > /tmp/data.csv
> \timing
> DROP TABLE heap;
> CREATE TABLE heap(i int);
> CREATE INDEX ON heap USING brin(i) WITH (pages_per_range=1);
> COPY heap FROM '/tmp/data.csv';
> 
> -- 3 runs (master 

Re: Parallel CREATE INDEX for BRIN indexes

2023-07-04 Thread Matthias van de Meent
On Thu, 8 Jun 2023 at 14:55, Tomas Vondra  wrote:
>
> Hi,
>
> Here's a WIP patch allowing parallel CREATE INDEX for BRIN indexes. The
> infrastructure (starting workers etc.) is "inspired" by the BTREE code
> (i.e. copied from that and massaged a bit to call brin stuff).

Nice work.

> In both cases _brin_end_parallel then reads the summaries from worker
> files, and adds them into the index. In 0001 this is fairly simple,
> although we could do one more improvement and sort the ranges by range
> start to make the index nicer (and possibly a bit more efficient). This
> should be simple, because the per-worker results are already sorted like
> that (so a merge sort in _brin_end_parallel would be enough).

I see that you manually built the passing and sorting of tuples
between workers, but can't we use the parallel tuplesort
infrastructure for that? It already has similar features in place and
improves code commonality.

> For 0002 it's a bit more complicated, because with a single parallel
> scan brinbuildCallbackParallel can't decide if a range is assigned to a
> different worker or empty. And we want to generate summaries for empty
> ranges in the index. We could either skip such range during index build,
> and then add empty summaries in _brin_end_parallel (if needed), or add
> them and then merge them using "union".
>
>
> I just realized there's a third option to do this - we could just do
> regular parallel scan (with no particular regard to pagesPerRange), and
> then do "union" when merging results from workers. It doesn't require
> the sequence of TID scans, and the union would also handle the empty
> ranges. The per-worker results might be much larger, though, because
> each worker might produce up to the "full" BRIN index.

Would it be too much effort to add a 'min_chunk_size' argument to
table_beginscan_parallel (or ParallelTableScanDesc) that defines the
minimum granularity of block ranges to be assigned to each process? I
think that would be the most elegant solution that would require
relatively little effort: table_block_parallelscan_nextpage already
does parallel management of multiple chunk sizes, and I think this
modification would fit quite well in that code.

Kind regards,

Matthias van de Meent




Re: PG 16 draft release notes ready

2023-07-04 Thread Bruce Momjian
On Tue, Jul  4, 2023 at 03:31:05PM +0900, Michael Paquier wrote:
> On Thu, May 18, 2023 at 04:49:47PM -0400, Bruce Momjian wrote:
> > I have completed the first draft of the PG 16 release notes.  You can
> > see the output here:
> > 
> > https://momjian.us/pgsql_docs/release-16.html
> > 
> > I will adjust it to the feedback I receive;  that URL will quickly show
> > all updates.
> 
> Sawada-san has mentioned on twitter that fdd8937 is not mentioned in
> the release notes, and it seems to me that he is right.  This is
> described as a bug in the commit log, but it did not get backpatched
> because of the lack of complaints.  Also, because we've removed
> support for anything older than Windows 10 in PG16, this change very
> easy to do.

I did review this and wasn't sure exactly what I would describe.  It is
saying huge pages will now work on some versions of Windows 10 but
didn't before?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: pg_basebackup check vs Windows file path limits

2023-07-04 Thread Daniel Gustafsson
> On 4 Jul 2023, at 20:19, Andrew Dunstan  wrote:

> But sadly we're kinda back where we started. fairywren is failing on 
> REL_16_STABLE. Before the changes the failure occurred because the test 
> script was unable to create the file with a path > 255. Now that we have a 
> way to create the file the test for pg_basebackup to reject files with names 
> > 100 fails, I presume because the server can't actually see the file. At 
> this stage I'm thinking the best thing would be to skip the test altogether 
> on windows if the path is longer than 255.

That does sound like a fairly large hammer for a nail small enough that we
should be able to fix it, but I don't have any other good ideas off the cuff.

--
Daniel Gustafsson





Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

2023-07-04 Thread Melih Mutlu
On Tue, Jul 4, 2023 at 08:42, Hayato Kuroda (Fujitsu) wrote:
> > > But in the later patch the tablesync worker tries to reuse the slot 
> > > during the
> > > synchronization, so in this case the application_name should be same as
> > slotname.
> > >
> >
> > Fair enough. I am slightly afraid that if we can't show the benefits
> > with later patches then we may need to drop them but at this stage I
> > feel we need to investigate why those are not helping?
>
> Agreed. Now I'm planning to do performance testing independently. We can 
> discuss
> based on that or Melih's one.

Here I attached  what I use for performance testing of this patch.

I only benchmarked the patch set with reusing connections very roughly
so far, but it seems to improve things quite significantly. For example,
it took 611 ms to sync 100 empty tables, versus 1782 ms without
reusing connections.
First 3 patches from the set actually bring a good amount of
improvement, but not sure about the later patches yet.

On Mon, Jul 3, 2023 at 08:59, Amit Kapila wrote:
> On thinking about this, I think the primary benefit we were expecting
> by saving network round trips for slot drop/create but now that we
> anyway need an extra round trip to establish a snapshot, so such a
> benefit was not visible. This is just a theory so we should validate
> it. The another idea as discussed before [1] could be to try copying
> multiple tables in a single transaction. Now, keeping a transaction
> open for a longer time could have side-effects on the publisher node.
> So, we probably need to ensure that we don't perform multiple large
> syncs and even for smaller tables (and later sequences) perform it
> only for some threshold number of tables which we can figure out by
> some tests. Also, the other safety-check could be that anytime we need
> to perform streaming (sync with apply worker), we won't copy more
> tables in same transaction.
>
> Thoughts?

Yeah, maybe going to the publisher for creating a slot or only a
snapshot does not really make enough difference. I was hoping that
creating only a snapshot using an existing replication slot would help the
performance. I guess I was either wrong or am missing something in the
implementation.

The tricky bit with keeping a long transaction to copy multiple tables
is deciding how many tables one transaction can copy.

Thanks,
-- 
Melih Mutlu
Microsoft
--- on publisher
SELECT 'CREATE TABLE manytables_'||i||'(i int);' FROM generate_series(1, 100) 
g(i) \gexec
SELECT pg_create_logical_replication_slot('mysub_slot', 'pgoutput');

--- on subscriber
SELECT 'CREATE TABLE manytables_'||i||'(i int);' FROM generate_series(1, 100) 
g(i) \gexec

CREATE OR REPLACE PROCEDURE log_rep_test(max INTEGER) AS $$
DECLARE
counter INTEGER := 1;
total_duration INTERVAL := '0';
avg_duration FLOAT := 0.0;
start_time TIMESTAMP;
end_time TIMESTAMP;
BEGIN
WHILE counter <= max LOOP

EXECUTE 'DROP SUBSCRIPTION IF EXISTS mysub;';

start_time := clock_timestamp();
EXECUTE 'CREATE SUBSCRIPTION mysub CONNECTION ''dbname=postgres 
port=5432'' PUBLICATION mypub WITH (create_slot=false, 
slot_name=''mysub_slot'');';
COMMIT;

WHILE EXISTS (SELECT 1 FROM pg_subscription_rel WHERE srsubstate != 
'r') LOOP
COMMIT;
END LOOP;

end_time := clock_timestamp();


EXECUTE 'ALTER SUBSCRIPTION mysub DISABLE;';
EXECUTE 'ALTER SUBSCRIPTION mysub SET (slot_name = none);';


total_duration := total_duration + (end_time - start_time);

counter := counter + 1;
END LOOP;

IF max > 0 THEN
avg_duration := EXTRACT(EPOCH FROM total_duration) / max * 1000;
END IF;

RAISE NOTICE '%', avg_duration;
END;
$$ LANGUAGE plpgsql;


call log_rep_test(5);

Re: brininsert optimization opportunity

2023-07-04 Thread Soumyadeep Chakraborty
Thank you both for reviewing!

On Tue, Jul 4, 2023 at 4:24AM Alvaro Herrera  wrote:

> Hmm, yeah, I remember being bit bothered by this repeated
> initialization. Your patch looks reasonable to me. I would set
> bistate->bs_rmAccess to NULL in the cleanup callback, just to be sure.
> Also, please add comments atop these two new functions, to explain what
> they are.

Done. Set bistate->bs_desc = NULL; as well. Added comments.


On Tue, Jul 4, 2023 at 4:59AM Tomas Vondra
 wrote:

> Yeah. I wonder how much of that runtime is the generate_series(),
> though. What's the speedup if that part is subtracted. It's guaranteed
> to be even more significant, but by how much?

When trying COPY, I got tripped by the following:

We get a buffer leak WARNING for the meta page and a revmap page.

WARNING:  buffer refcount leak: [094] (rel=base/156912/206068,
blockNum=1, flags=0x8300, refcount=1 1)
WARNING:  buffer refcount leak: [093] (rel=base/156912/206068,
blockNum=0, flags=0x8300, refcount=1 1)

PrintBufferLeakWarning bufmgr.c:3240
ResourceOwnerReleaseInternal resowner.c:554
ResourceOwnerRelease resowner.c:494
PortalDrop portalmem.c:563
exec_simple_query postgres.c:1284

We release the buffer during this resowner release and then we crash
with:

TRAP: failed Assert("bufnum <= NBuffers"), File:
"../../../../src/include/storage/bufmgr.h", Line: 305, PID: 86833
postgres: pivotal test4 [local] COPY(ExceptionalCondition+0xbb)[0x5572b55bcc79]
postgres: pivotal test4 [local] COPY(+0x61ccfc)[0x5572b537dcfc]
postgres: pivotal test4 [local] COPY(ReleaseBuffer+0x19)[0x5572b5384db2]
postgres: pivotal test4 [local] COPY(brinRevmapTerminate+0x1e)[0x5572b4e3fd39]
postgres: pivotal test4 [local] COPY(+0xcfc44)[0x5572b4e30c44]
postgres: pivotal test4 [local] COPY(+0x89e7f2)[0x5572b55ff7f2]
postgres: pivotal test4 [local] COPY(MemoryContextDelete+0xd7)[0x5572b55ff683]
postgres: pivotal test4 [local] COPY(PortalDrop+0x374)[0x5572b5602dc7]

Unfortunately, when we do COPY, the MemoryContext where makeIndexInfo
gets called is PortalContext and that is what is set in ii_Context.
Furthermore, we clean up the resource owner stuff before we can clean
up the MemoryContexts in PortalDrop().

The CurrentMemoryContext when initialize_brin_insertstate() is called
varies. For CopyMultiInsertBufferFlush() -> ExecInsertIndexTuples() it is
PortalContext, and for CopyFrom() -> ExecInsertIndexTuples() it is
ExecutorState/ExprContext. So we can't rely on it to register the callback
either.

What we can do is create a new MemoryContext for holding the
BrinInsertState, and tie the callback to that, so that cleanup is not
affected by any of these variables. See the v2 patch attached. It passes make
installcheck-world and make installcheck -C src/test/modules/brin.

However, we do still have 1 issue with the v2 patch:
When we try to cancel (Ctrl-c) a running COPY command:
ERROR:  buffer 151 is not owned by resource owner TopTransaction

#4  0x559cbc54a934 in ResourceOwnerForgetBuffer
(owner=0x559cbd6fcf28, buffer=143) at resowner.c:997
#5  0x559cbc2c45e7 in UnpinBuffer (buf=0x7f8d4a8f3f80) at bufmgr.c:2390
#6  0x559cbc2c7e49 in ReleaseBuffer (buffer=143) at bufmgr.c:4488
#7  0x559cbbd82d53 in brinRevmapTerminate (revmap=0x559cbd7a03b8)
at brin_revmap.c:105
#8  0x559cbbd73c44 in brininsertCleanupCallback
(arg=0x559cbd7a5b68) at brin.c:168
#9  0x559cbc54280c in MemoryContextCallResetCallbacks
(context=0x559cbd7a5a50) at mcxt.c:506
#10 0x559cbc54269d in MemoryContextDelete (context=0x559cbd7a5a50)
at mcxt.c:421
#11 0x559cbc54273e in MemoryContextDeleteChildren
(context=0x559cbd69ae90) at mcxt.c:457
#12 0x559cbc54625c in AtAbort_Portals () at portalmem.c:850

Haven't found a way to fix this ^ yet.

Maybe there is a better way of doing our cleanup? I'm not sure. Would
love your input!

The other alternative for all this is to introduce new AM callbacks for
insert_begin and insert_end. That might be a tougher sell?

Now, to finally answer your question about the speedup without
generate_series(): we do see an even higher speedup!

seq 1 2 > /tmp/data.csv
\timing
DROP TABLE heap;
CREATE TABLE heap(i int);
CREATE INDEX ON heap USING brin(i) WITH (pages_per_range=1);
COPY heap FROM '/tmp/data.csv';

-- 3 runs (master 29cf61ade3f245aa40f427a1d6345287ef77e622)
COPY 2
Time: 205072.444 ms (03:25.072)
Time: 215380.369 ms (03:35.380)
Time: 203492.347 ms (03:23.492)

-- 3 runs (branch v2)

COPY 2
Time: 135052.752 ms (02:15.053)
Time: 135093.131 ms (02:15.093)
Time: 138737.048 ms (02:18.737)

Regards,
Soumyadeep (VMware)
From 54ba134f4afe9c4bac19e7d8fde31b9768dc23cd Mon Sep 17 00:00:00 2001
From: Soumyadeep Chakraborty 
Date: Tue, 4 Jul 2023 11:50:35 -0700
Subject: [PATCH v2 1/1] Reuse revmap and brin desc in brininsert

brininsert() used to have code that performed per-tuple initialization
of the revmap. That had some overhead.
---
 src/backend/access/brin/brin.c | 89 +-
 1 file 

Re: Reducing connection overhead in pg_upgrade compat check phase

2023-07-04 Thread Nathan Bossart
I put together a rebased version of the patch for cfbot.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From ee5805dc450f081b77ae3a7df315ceafb6ccc5e1 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson 
Date: Mon, 13 Mar 2023 14:46:24 +0100
Subject: [PATCH v4 1/1] pg_upgrade: run all data type checks per connection

The checks for data type usage were each connecting to all databases
in the cluster and running their query. On clusters which have a lot
of databases this can become unnecessarily expensive. This moves the
checks to run in a single connection instead to minimize connection
setup/teardown overhead.

Reviewed-by: Nathan Bossart 
Reviewed-by: Justin Pryzby 
Discussion: https://postgr.es/m/bb4c76f-d416-4f9f-949e-dbe950d37...@yesql.se
---
 src/bin/pg_upgrade/check.c  | 575 
 src/bin/pg_upgrade/pg_upgrade.h |  29 +-
 src/bin/pg_upgrade/version.c| 289 +++-
 3 files changed, 433 insertions(+), 460 deletions(-)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 64024e3b9e..c829aed26e 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -10,6 +10,7 @@
 #include "postgres_fe.h"
 
 #include "catalog/pg_authid_d.h"
+#include "catalog/pg_class_d.h"
 #include "catalog/pg_collation.h"
 #include "fe_utils/string_utils.h"
 #include "mb/pg_wchar.h"
@@ -23,14 +24,375 @@ static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
 static void check_for_user_defined_postfix_ops(ClusterInfo *cluster);
 static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static void check_for_tables_with_oids(ClusterInfo *cluster);
-static void check_for_composite_data_type_usage(ClusterInfo *cluster);
-static void check_for_reg_data_type_usage(ClusterInfo *cluster);
-static void check_for_aclitem_data_type_usage(ClusterInfo *cluster);
-static void check_for_jsonb_9_4_usage(ClusterInfo *cluster);
 static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(ClusterInfo *new_cluster);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
 
+/*
+ * Data type usage checks. Each check for problematic data type usage is
+ * defined in this array with metadata, SQL query for finding the data type
+ * and a function pointer for determining if the check should be executed
+ * for the current version.
+ */
+static int n_data_types_usage_checks = 7;
+static DataTypesUsageChecks data_types_usage_checks[] = {
+	/*
+	 * Look for composite types that were made during initdb *or* belong to
+	 * information_schema; that's important in case information_schema was
+	 * dropped and reloaded.
+	 *
+	 * The cutoff OID here should match the source cluster's value of
+	 * FirstNormalObjectId.  We hardcode it rather than using that C #define
+	 * because, if that #define is ever changed, our own version's value is
+	 * NOT what to use.  Eventually we may need a test on the source cluster's
+	 * version to select the correct value.
+	 */
+	{.status = "Checking for system-defined composite types in user tables",
+	 .report_filename = "tables_using_composite.txt",
+	 .base_query =
+	 "SELECT t.oid FROM pg_catalog.pg_type t "
+	 "LEFT JOIN pg_catalog.pg_namespace n ON t.typnamespace = n.oid "
+	 " WHERE typtype = 'c' AND (t.oid < 16384 OR nspname = 'information_schema')",
+	 .report_text =
+	 "Your installation contains system-defined composite type(s) in user tables.\n"
+	 "These type OIDs are not stable across PostgreSQL versions,\n"
+	 "so this cluster cannot currently be upgraded.  You can\n"
+	 "drop the problem columns and restart the upgrade.\n"
+	 "A list of the problem columns is in the file:",
+	 .version_hook = NULL},
+
+	/*
+	 * 9.3 -> 9.4
+	 *	Fully implement the 'line' data type in 9.4, which previously returned
+	 *	"not enabled" by default and was only functionally enabled with a
+	 *	compile-time switch; as of 9.4 "line" has a different on-disk
+	 *	representation format.
+	 */
+	{.status = "Checking for incompatible \"line\" data type",
+	 .report_filename = "tables_using_line.txt",
+	 .base_query =
+	 "SELECT 'pg_catalog.line'::pg_catalog.regtype AS oid",
+	 .report_text =
+	 "your installation contains the \"line\" data type in user tables.\n"
+	 "this data type changed its internal and input/output format\n"
+	 "between your old and new versions so this\n"
+	 "cluster cannot currently be upgraded.  you can\n"
+	 "drop the problem columns and restart the upgrade.\n"
+	 "a list of the problem columns is in the file:",
+	 .version_hook = old_9_3_check_for_line_data_type_usage},
+
+	/*
+	 *	pg_upgrade only preserves these system values:
+	 *		pg_class.oid
+	 *		pg_type.oid
+	 *		pg_enum.oid
+	 *
+	 *	Many of the reg* data types reference system catalog info that is
+	 *	not preserved, and hence these data types cannot be used in user
+	 *	tables upgraded by pg_upgrade.
+	 */
+	{.status = "Checking for reg* data types in user 

Re: pg_stat_statements and "IN" conditions

2023-07-04 Thread Dmitry Dolgov
> On Mon, Jul 03, 2023 at 09:46:11PM -0700, Nathan Bossart wrote:

Thanks for reviewing.

> On Sun, Mar 19, 2023 at 01:27:34PM +0100, Dmitry Dolgov wrote:
> > +If this parameter is on, two queries with an array will get the 
> > same
> > +query identifier if the only difference between them is the number 
> > of
> > +constants, both numbers is of the same order of magnitude and 
> > greater or
> > +equal 10 (so the order of magnitude is greather than 1, it is not 
> > worth
> > +the efforts otherwise).
>
> IMHO this adds way too much complexity to something that most users would
> expect to be an on/off switch.

This documentation exists exclusively to be precise about how it works.
Users don't have to worry about all this, and can pretty much just turn it
on/off, as you've described. I agree though, I could probably write this
text a bit differently.

> If I understand Álvaro's suggestion [0] correctly, he's saying that in
> addition to allowing "on" and "off", it might be worth allowing
> something like "powers" to yield roughly the behavior described above.
> I don't think he's suggesting that this "powers" behavior should be
> the only available option.

Independently of what Álvaro was suggesting, I find the "powers"
approach more suitable, because it answers my own concerns about the
previous implementation. Having "on"/"off" values means we would have to
scratch our heads coming up with a one-size-fits-all default value, or
introduce another option for the actual cut-off threshold. I would like
to avoid both of those, which is why I went with "powers" only.
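
To make that grouping concrete, here is a rough sketch of the behavior the
quoted documentation describes, assuming the proposed setting is enabled
(the table and the values are made up for illustration, and the exact
cut-offs are whatever the patch ends up implementing):

CREATE TABLE t (id int);

-- These two queries differ only in the number of IN-list constants; both
-- counts are >= 10 and fall in the same order of magnitude (10..99), so
-- they would collapse into a single pg_stat_statements entry.
SELECT * FROM t WHERE id IN (1,2,3,4,5,6,7,8,9,10,11,12);
SELECT * FROM t WHERE id IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20);

-- Fewer than 10 constants: kept as a separate entry, not merged.
SELECT * FROM t WHERE id IN (1,2,3);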

> Also, it seems counterintuitive that queries with fewer than 10
> constants are not merged.

Why? What would your intuition be when using this feature?

> In the interest of moving this patch forward, I would suggest making it a
> simple on/off switch in 0002 and moving the "powers" functionality to a new
> 0003 patch.  I think separating out the core part of this feature might
> help reviewers.  As you can see, I got distracted by the complicated
> threshold logic and ended up focusing my first round of review there.

I would disagree. As I've described above, to me "powers" seems to be a
better fit, and the complicated logic is in fact reusing an already
existing function. Do those arguments sound convincing to you?




Re: Add more sanity checks around callers of changeDependencyFor()

2023-07-04 Thread Tom Lane
Alvaro Herrera  writes:
> Hmm, shouldn't we disallow moving the function to another schema, if the
> function's schema was originally determined at extension creation time?
> I'm not sure we really want to allow moving objects of an extension to a
> different schema.

Why not?  I do not believe that an extension's objects are required
to all be in the same schema.

regards, tom lane




Re: suppressing useless wakeups in logical/worker.c

2023-07-04 Thread Nathan Bossart
On Tue, Jul 04, 2023 at 09:48:23AM +0200, Daniel Gustafsson wrote:
>> On 17 Mar 2023, at 10:16, Amit Kapila  wrote:
> 
>> Few minor comments:
> 
> Have you had a chance to address the comments raised by Amit in this thread?

Not yet, sorry.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: O(n) tasks cause lengthy startups and checkpoints

2023-07-04 Thread Nathan Bossart
On Tue, Jul 04, 2023 at 09:30:43AM +0200, Daniel Gustafsson wrote:
>> On 4 Apr 2023, at 05:36, Nathan Bossart  wrote:
>> 
>> I sent this one to the next commitfest and marked it as waiting-on-author
>> and targeted for v17.  I'm aiming to have something that addresses the
>> latest feedback ready for the July commitfest.
> 
> Have you had a chance to look at this such that there is something ready?

Not yet, sorry.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: On /*----- comments

2023-07-04 Thread Tom Lane
Heikki Linnakangas  writes:
> On 03/07/2023 11:48, Daniel Gustafsson wrote:
> On 30 Jun 2023, at 17:22, Tom Lane  wrote:
>>> Seems reasonable; the trailing dashes eat a line without adding much.

>> +1

> Pushed a patch to remove the end-guard from the example in the pgindent 
> README. And fixed the bogus end-guard in walsender.c.

I don't see any actual push?

> I'm not sure there is a universal best length. It depends on the comment 
> what looks best. The very long ones in particular would not look good on 
> comments in a deeply indented block. So I think the status quo is fine.

OK, no strong feeling about that here.

regards, tom lane




Re: Cleaning up array_in()

2023-07-04 Thread Tom Lane
Nikhil Benesch  writes:
> I spotted a few opportunities for further reducing state tracked by
> `ArrayCount`. You may not find all of these suggestions to be
> worthwhile.

I found some time today to look at these points.

> 1) `in_quotes` appears to be wholly redundant with `parse_state ==
> ARRAY_QUOTED_ELEM_STARTED`.

I agree that it is redundant, but I'm disinclined to remove it because
the in_quotes logic matches that in ReadArrayStr.  I think it's better
to keep those two functions in sync.  The parse_state represents an
independent set of checks that need not be repeated by ReadArrayStr,
but both functions have to track quoting.  The same for eoArray.

> 2) The `empty_array` special case does not seem to be important to
> ArrayCount's callers, which don't even special case `ndims == 0` but
> rather `ArrayGetNItemsSafe(..) == 0`. Perhaps this is a philosophical
> question as to whether `ArrayCount('{{}, {}}')` should return
> (ndims=2, dims=[2, 0]) or (ndims=0). Obviously someone needs to do
> that normalization, but `ArrayCount` could leave that normalization to
> `ReadArrayStr`.

This idea I do like.  While looking at the callers, I also noticed
that it's impossible currently to write an empty array with explicit
specification of bounds.  It seems to me that you ought to be able
to write, say,

SELECT '[1:0]={}'::int[];

but up to now you got "upper bound cannot be less than lower bound";
and if you somehow got past that, you'd get "Specified array
dimensions do not match array contents." because of ArrayCount's
premature optimization of "one-dimensional array with length zero"
to "zero-dimensional array".  We can fix that by doing what you said
and adjusting the initial bounds restriction to be "upper bound cannot
be less than lower bound minus one".
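
In other words (a sketch of the intended post-patch behavior, not of what
HEAD does today):

SELECT '[1:0]={}'::int[];    -- empty array with explicit bounds: accepted
SELECT '[1:-1]={}'::int[];   -- upper bound < lower bound - 1: still an error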

> I also have a sense that `ndims_frozen` made the distinction between
> `ARRAY_ELEM_DELIMITED` and `ARRAY_LEVEL_DELIMITED` unnecessary, and
> the two states could be merged into a single `ARRAY_DELIMITED` state,
> but I've not pulled on this thread hard enough to say so confidently.

I looked at jian he's implementation of that and was not impressed:
I do not think the logic gets any clearer, and it seems to me that
this makes a substantial dent in ArrayCount's ability to detect syntax
errors.  The fact that one of the test case error messages got better
seems pretty accidental to me.  We can get the same result in a more
purposeful way by giving a different error message for
ARRAY_ELEM_DELIMITED.

So I end up with the attached.  I went ahead and dropped
ArrayGetOffset0() as part of 0001, and I split 0002 into two patches
where the new 0002 avoids re-indenting any existing code in order
to ease review, and then 0003 is just a mechanical application
of pgindent.

I still didn't do anything about "number of array dimensions (7)
exceeds the maximum allowed (6)".  There are quite a few instances
of that wording, not only array_in's, and I'm not sure whether to
change the rest.  In any case that looks like something that
could be addressed separately.  The other error message wording
changes here seem to me to be okay.

regards, tom lane

From 82cac24a618db69e149c140e7064eebda9f1ddfc Mon Sep 17 00:00:00 2001
From: Tom Lane 
Date: Tue, 4 Jul 2023 12:39:41 -0400
Subject: [PATCH v2 1/3] Simplify and speed up ReadArrayStr().

ReadArrayStr() seems to have been written on the assumption that
non-rectangular input is fine and it should pad with NULLs anywhere
that elements are missing.  We disallowed non-rectangular input
ages ago (commit 0e13d627b), but never simplified this function
as a follow-up.  In particular, the existing code recomputes each
element's linear location from scratch, which is quite unnecessary
for rectangular input: we can just assign the elements sequentially,
saving lots of arithmetic.  Add some more commentary while at it.

ArrayGetOffset0() is no longer used anywhere, so remove it.
---
 src/backend/utils/adt/arrayfuncs.c | 69 ++
 src/backend/utils/adt/arrayutils.c | 15 ---
 src/include/utils/array.h  |  1 -
 3 files changed, 33 insertions(+), 52 deletions(-)

diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 9000f83a83..050568808a 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -93,7 +93,7 @@ static bool array_isspace(char ch);
 static int	ArrayCount(const char *str, int *dim, char typdelim,
 	   Node *escontext);
 static bool ReadArrayStr(char *arrayStr, const char *origStr,
-		 int nitems, int ndim, int *dim,
+		 int nitems,
 		 FmgrInfo *inputproc, Oid typioparam, int32 typmod,
 		 char typdelim,
 		 int typlen, bool typbyval, char typalign,
@@ -391,7 +391,7 @@ array_in(PG_FUNCTION_ARGS)
 	dataPtr = (Datum *) palloc(nitems * sizeof(Datum));
 	nullsPtr = (bool *) palloc(nitems * sizeof(bool));
 	if (!ReadArrayStr(p, string,
-	  nitems, ndim, dim,
+	  

Re: pg_basebackup check vs Windows file path limits

2023-07-04 Thread Andrew Dunstan


On 2023-07-03 Mo 11:18, Andrew Dunstan wrote:



On 2023-07-03 Mo 10:16, Daniel Gustafsson wrote:

On 3 Jul 2023, at 16:12, Andrew Dunstan  wrote:
I've pushed a better solution, which creates the file via a short symlink. 
Experimentation on fairywren showed this working.

The buildfarm seems a tad upset after this?



Yeah :-(

I think it should be fixing itself now.





But sadly we're kinda back where we started. fairywren is failing on
REL_16_STABLE. Before the changes, the failure occurred because the test
script was unable to create a file with a path > 255. Now that we have
a way to create the file, the test that pg_basebackup rejects files with
names > 100 fails, I presume because the server can't actually see the
file. At this stage I'm thinking the best thing would be to skip the
test altogether on Windows if the path is longer than 255.



cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: Is a pg_stat_force_next_flush() call sufficient for regression tests?

2023-07-04 Thread Tomas Vondra



On 7/4/23 04:29, Kyotaro Horiguchi wrote:
> At Mon, 3 Jul 2023 15:45:52 +0200, Tomas Vondra 
>  wrote in 
>> So I'm wondering if pg_stat_force_next_flush() is enough - AFAICS this
>> only sets some flag for the *next* pgstat_report_stat() call, but how do
>> we know that happens before the query execution?
>>
>> Shouldn't there be something like pg_stat_flush() that actually does the
>> flushing, instead of just setting the flag?
> 
> The reason for the function is that pg_stat_flush() is supposed not to
> be called within a transaction.  AFAICS pg_stat_force_next_flush()
> takes effect after a successfull transaction end and before the next
> command execution.
> 

Sure, if we're supposed to report the stats only at the end of a
transaction, that makes sense. But then why didn't that happen here?

> To verify this, I put in an assertion to check that the flag gets
> consumed before reading of pg_stat_io (a.diff), then ran pgbench with
> the attached custom script. As expected, it didn't fire at all during
> several trials. When I wrapped all lines in t.sql within a
> begin-commit block, the assertion fired off immediately as a matter of
> course.
> 

If I understand correctly, this just verifies that

1) if everything goes well, we report the stats at the end of the
transaction (otherwise the case without BEGIN/COMMIT would fail)

2) we don't report stats when in a transaction (with the BEGIN/COMMIT)
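
In SQL terms, the pattern described by those two points is roughly the
following (just a sketch; the test table and the pg_stat_io filter values
are made up):

CREATE TABLE io_test AS SELECT g FROM generate_series(1, 1000) g; -- some I/O
SELECT pg_stat_force_next_flush();
-- the implicit transaction ends here, so the stats should be flushed
-- before the next command runs
SELECT reads, writes, extends
  FROM pg_stat_io
 WHERE backend_type = 'client backend'
   AND object = 'relation'
   AND context = 'normal';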

But the eelpout failure clearly suggests this may misbehave.

> Is there any chance concurrent backends or some other things can
> actually hinder the backend from reusing buffers?
> 

No idea. I'm not very familiar with the reworked pgstat system, but
either pgstat_report_stat() was not called for some reason, or it
decided there's nothing to report (i.e. have_iostats == false). Not sure
why that would happen.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: On /*----- comments

2023-07-04 Thread Heikki Linnakangas

On 03/07/2023 11:48, Daniel Gustafsson wrote:

On 30 Jun 2023, at 17:22, Tom Lane  wrote:



Seems reasonable; the trailing dashes eat a line without adding much.


+1


Pushed a patch to remove the end-guard from the example in the pgindent 
README. And fixed the bogus end-guard in walsender.c.



Should we also provide specific guidance about how many leading dashes
to use for this?  I vaguely recall that pgindent might only need one,
but I think using somewhere around 5 to 10 looks better.


There are ~50 different lengths used when looking at block comments from line 2
(to avoid the file header comment) onwards in files; the ones with 10 or
more occurrences are:

  145 /*--
   78 /*--
   76 
/*-
   37 /*--
   29 /*
   23 /*
   22 /*
   21 /*
   15 /*-
   14 /*--
   13 /*---
   13 /*---
   12 /*--

10 leading dashes is the clear winner, so recommending that for new/edited
comments seems like a good way to reduce churn.


The example in the pgindent README also uses 10 dashes.

I'm not sure there is a universal best length. It depends on the comment 
what looks best. The very long ones in particular would not look good on 
comments in a deeply indented block. So I think the status quo is fine.



Looking at line 1 comments for fun shows pretty strong consistency:

1611 /*-
   22 
/*--
   18 /*
   13 /*
7 
/*---
4 /*---
4 /*--
1 /*--

plpy_util.h being the only one that sticks out.


I don't see any reason for the variance in these. Seems accidental..

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Does a cancelled REINDEX CONCURRENTLY need to be messy?

2023-07-04 Thread Álvaro Herrera
On 2023-Jul-04, Michael Paquier wrote:

> On Mon, Jul 03, 2023 at 07:46:27PM +0200, Alvaro Herrera wrote:

> > Perhaps we could have autovacuum check for it, and do it
> > separately of vacuum proper.)
> 
> Being able to reuse some of the worker/launcher parts from autovacuum
> could make things easier for a bgworker implementation, perhaps?

TBH I don't understand what you are thinking about.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"I can see support will not be a problem.  10 out of 10."(Simon Wittber)
  (http://archives.postgresql.org/pgsql-general/2004-12/msg00159.php)




Re: Add more sanity checks around callers of changeDependencyFor()

2023-07-04 Thread Alvaro Herrera
On 2023-Jun-29, Heikki Linnakangas wrote:

> I can hit the above error with the attached test case. That seems wrong,
> although I don't know if it means that the check is wrong or it exposed a
> long-standing bug.

> +CREATE SCHEMA test_func_dep1;
> +CREATE SCHEMA test_func_dep2;
> +CREATE EXTENSION test_ext_req_schema1 SCHEMA test_func_dep1;
> +ALTER FUNCTION test_func_dep1.dep_req1() SET SCHEMA test_func_dep2;
> +
> +ALTER EXTENSION test_ext_req_schema1 SET SCHEMA test_func_dep2;
> +
> +DROP EXTENSION test_ext_req_schema1 CASCADE;

Hmm, shouldn't we disallow moving the function to another schema, if the
function's schema was originally determined at extension creation time?
I'm not sure we really want to allow moving objects of an extension to a
different schema.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
“Cuando no hay humildad las personas se degradan” (A. Christie)




Re: Missing llvm_leave_fatal_on_oom() call

2023-07-04 Thread Daniel Gustafsson
> On 21 Feb 2023, at 15:50, Heikki Linnakangas  wrote:
> 
> llvm_release_context() calls llvm_enter_fatal_on_oom(), but it never calls 
> llvm_leave_fatal_on_oom(). Isn't that a clear leak?

Not sure how much of a leak it is since IIUC LLVM just stores a function
pointer to our error handler, but I can't see a reason not to clean it up here.
The attached fix LGTM and passes make check with jit_above_cost set to zero.

--
Daniel Gustafsson





Re: Creation of an empty table is not fsync'd at checkpoint

2023-07-04 Thread Heikki Linnakangas

On 03/07/2023 15:59, Daniel Gustafsson wrote:

This patch required a trivial rebase after conflicting with bfcf1b3480 so I've
attached a v2 to get the CFBot to run this.


Thank you! Pushed to all supported branches. (Without forgetting the new 
REL_16_STABLE :-D ).


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Initdb-time block size specification

2023-07-04 Thread Peter Eisentraut

On 01.07.23 00:21, Tomas Vondra wrote:

Right, that's the dance we do to protect against torn pages. But Andres
suggested that if you have modern storage and configure it correctly,
writing with 4kB pages would be atomic. So we wouldn't need to do this
FPI stuff, eliminating pretty significant source of write amplification.


This work in progress for the Linux kernel was also mentioned at PGCon:
.  Subject to various conditions, the
kernel would then guarantee atomic writes for blocks larger than the
hardware's native size.






Re: Why is DATESTYLE, ordering ignored for output but used for input ?

2023-07-04 Thread Dave Cramer
On Mon, 3 Jul 2023 at 17:13, Matthias van de Meent <
boekewurm+postg...@gmail.com> wrote:

> On Mon, 3 Jul 2023 at 20:06, Dave Cramer  wrote:
> >
> > Greetings,
> >
> > For ISO and German dates the order DMY is completely ignored on output
> but used for input.
> >
> > test=# set datestyle to 'ISO,DMY';
> > SET
> > select '7-8-2023'::date
> > test-# ;
> > date
> > 
> >  2023-08-07
> > (1 row)
> >
> > test=# set datestyle to 'ISO,MDY';
> > SET
> > test=# select '7-8-2023'::date
> > ;
> > date
> > 
> >  2023-07-08
> > (1 row)
> >
> > Note regardless of  how the ordering is specified it is always output as
> > YMD
>
> Wouldn't that be because ISO only has one correct ordering of the day
> and month fields? I fail to see why we'd output non-ISO-formatted date
> strings when ISO format is requested. I believe the reason is the same
> for German dates - Germany's official (or most common?) date
> formatting has a single ordering of these fields, which is also the
> ordering that we supply.
>

It seems rather unintuitive that it works for some datestyles and not for
others.
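
For comparison, the SQL output style does honor the field ordering; it is
only the styles whose output format is fixed (ISO, German) that ignore it.
A quick sketch (output shown in the comments is from memory and may differ
slightly in punctuation):

SET datestyle TO 'SQL,DMY';
SELECT date '2023-08-07';   -- 07/08/2023, day first

SET datestyle TO 'SQL,MDY';
SELECT date '2023-08-07';   -- 08/07/2023, month first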


>
> The code comments also seem to hint to this:
>
> > case USE_ISO_DATES:
> > case USE_XSD_DATES:
> >  /* compatible with ISO date formats */
>
> > case USE_GERMAN_DATES:
> > /* German-style date format */
>
> This has been this way since the code for ISO was originally committed
> in July of '97 with 8507ddb9 and the GERMAN formatting which was added
> in December of '97 as D.M/Y with 352b3687 (and later that month was
> updated to D.M.Y with ca23837a).
> Sadly, the -hackers archives don't seem to have any mails from that
> time period, so I couldn't find much info on the precise rationale
> around this behavior.
>

Yeah, I couldn't find much either.


>
> Kind regards,
>
> Matthias van de Meent
> Neon (https://neon.tech/)
>
> PS. That was some interesting digging into the history of the date
> formatting module.
>

Always interesting digging into the history of the project.

Dave


Re: [PATCH] Extend the length of BackgroundWorker.bgw_library_name

2023-07-04 Thread Yurii Rashkovskii
Nathan,

On Mon, Jul 3, 2023 at 8:12 PM Nathan Bossart 
wrote:

> On Mon, Jul 03, 2023 at 06:00:12PM -0700, Yurii Rashkovskii wrote:
> > Great, thank you! The reason I was leaving the other constant in place to
> > make upgrading extensions trivial (so that they don't need to adjust for
> > this), but if you think this is a better way, I am fine with it.
>
> Sorry, I'm not following.  Which constant are you referring to?
>

Apologies, I misread the final patch. All good!

-- 
Y.


Re: ResourceOwner refactoring

2023-07-04 Thread Aleksander Alekseev
Hi,

> Thanks, here's a fixed version. (ResourceOwner resource release
> callbacks mustn't call ResourceOwnerForget anymore, but AbortBufferIO
> was still doing that)

I believe v15 is ready to be merged. I suggest we merge it early in
the PG17 release cycle.

-- 
Best regards,
Aleksander Alekseev




Re: Experiments with Postgres and SSL

2023-07-04 Thread Heikki Linnakangas

On 31/03/2023 10:59, Greg Stark wrote:

IIRC I put a variable labeled a "GUC" but forgot to actually make it a
GUC. But I'm thinking of maybe removing that variable since I don't
see much of a use case for controlling this manually. I *think* ALPN
is supported by all the versions of OpenSSL we support.


+1 on removing the variable. Let's make ALPN mandatory for direct SSL 
connections, like Jacob suggested. And for old-style handshakes, accept 
and check ALPN if it's given.


I don't see the point of the libpq 'sslalpn' option either. Let's send 
ALPN always.


Admittedly, having the options makes testing different combinations of
old and new clients and servers a little easier. But I don't think we
should add options just for the sake of backwards-compatibility tests.



--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -1126,13 +1126,16 @@ pq_discardbytes(size_t len)
 /* 
  * pq_buffer_has_data  - is any buffered data 
available to read?
  *
- * This will *not* attempt to read more data.
+ * Actually returns the number of bytes in the buffer...
+ *
+ * This will *not* attempt to read more data. And reading up to that number of
+ * bytes should not cause reading any more data either.
  * 
  */
-bool
+size_t
 pq_buffer_has_data(void)
 {
-   return (PqRecvPointer < PqRecvLength);
+   return (PqRecvLength - PqRecvPointer);
 }


Let's rename the function.


/* push unencrypted buffered data back through SSL setup */
len = pq_buffer_has_data();
if (len > 0)
{
buf = palloc(len);
if (pq_getbytes(buf, len) == EOF)
return STATUS_ERROR; /* shouldn't be possible */
port->raw_buf = buf;
port->raw_buf_remaining = len;
port->raw_buf_consumed = 0;
}

Assert(pq_buffer_has_data() == 0);
if (secure_open_server(port) == -1)
{
ereport(COMMERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
 errmsg("SSL Protocol Error during direct 
SSL connection initiation")));
return STATUS_ERROR;
}

if (port->raw_buf_remaining > 0)
{
ereport(COMMERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
 errmsg("received unencrypted data after SSL 
request"),
 errdetail("This could be either a 
client-software bug or evidence of an attempted man-in-the-middle attack.")));
return STATUS_ERROR;
}
if (port->raw_buf)
pfree(port->raw_buf);


This pattern is repeated in both callers of secure_open_server(). Could 
we move this into secure_open_server() itself? That would feel pretty 
natural, be-secure.c already contains the secure_raw_read() function 
that reads the 'raw_buf' field.



const char *
PQsslAttribute(PGconn *conn, const char *attribute_name)
{
...

if (strcmp(attribute_name, "alpn") == 0)
{
const unsigned char *data;
unsigned int len;
static char alpn_str[256]; /* alpn doesn't support longer than 
255 bytes */
SSL_get0_alpn_selected(conn->ssl, &data, &len);
if (data == NULL || len==0 || len > sizeof(alpn_str)-1)
return NULL;
memcpy(alpn_str, data, len);
alpn_str[len] = 0;
return alpn_str;
}


Using a static buffer doesn't look right. If you call PQsslAttribute on 
two different connections from two different threads concurrently, they 
will write to the same buffer. I see that you copied it from the 
"key_bits" handling, but it has the same issue.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: logical decoding and replication of sequences, take 2

2023-07-04 Thread Ashutosh Bapat
On Mon, Jun 26, 2023 at 8:35 PM Tomas Vondra
 wrote:
> On 6/26/23 15:18, Ashutosh Bapat wrote:

> > I will look at 0004 next.
> >
>
> OK


0004 is quite large. I think if we split it into two or even three parts
(1. publication and subscription catalog handling, 2. built-in replication
protocol changes), it might be easier to review. But anyway, I have given it
one read. I have reviewed the parts which deal with the replication proper in
detail. I have *not* thoroughly reviewed the parts which deal with the
catalogs, pg_dump, describe and tab completion. Similarly the tests. If those
parts need a thorough review, please let me know.

But before jumping into the comments, a weird scenario I tried. On the
publisher I created a table t1(a int, b int) and a sequence s and added both
to a publication. On the subscriber I swapped their names, i.e. created a
table s(a int, b int) and a sequence t1, and subscribed to the publication.
The subscription was created, and during replication it threw the errors
"logical replication target relation "public.t1" is missing replicated
columns: "a", "b"" and "logical replication target relation "public.s" is
missing replicated columns: "last_value", "log_cnt", "is_called"". I think
it's good that it at least threw an error. But it would be better if it
detected that the reltypes themselves are different and mentioned that in the
error. Something like "logical replication target "public.s" is not a
sequence like source "public.s".

Comments on the patch itself.

I didn't find any mention of 'sequence' in the documentation of publish option
in CREATE or ALTER PUBLICATION. Something missing in the documentation? But do
we really need to record "sequence" as an operation? Just adding the sequences
to the publication should be fine right? There's only one operation on
sequences, updating the sequence row.

+CREATE VIEW pg_publication_sequences AS
+SELECT
+P.pubname AS pubname,
+N.nspname AS schemaname,
+C.relname AS sequencename

If we report the oid or regclass of the sequence, it might be easier to join
the view further. We don't have a reg* type for publications, so we could
report both the oid and the name of the publication.

+/*
+ * Update the sequence state by modifying the existing sequence data row.
+ *
+ * This keeps the same relfilenode, so the behavior is non-transactional.
+ */
+static void
+SetSequence_non_transactional(Oid seqrelid, int64 last_value, int64
log_cnt, bool is_called)

This function has some code similar to nextval but with the sequence
of operations (viz. changes to buffer, WAL insert and cache update) changed.
Given the comments in nextval_internal() the difference in sequence of
operations should not make a difference in the end result. But I think it will
be good to deduplicate the code to avoid confusion and also for ease of
maintenance.

+
+/*
+ * Update the sequence state by creating a new relfilenode.
+ *
+ * This creates a new relfilenode, to allow transactional behavior.
+ */
+static void
+SetSequence_transactional(Oid seq_relid, int64 last_value, int64
log_cnt, bool is_called)

Need some deduplication here as well. But the similarities with AlterSequence,
ResetSequence or DefineSequence are less.

@@ -730,9 +731,9 @@ CreateSubscription(ParseState *pstate,
CreateSubscriptionStmt *stmt,
 {
 /*
- * Get the table list from publisher and build local table status
- * info.
+ * Get the table and sequence list from publisher and build
+ * local relation sync status info.
  */
-tables = fetch_table_list(wrconn, publications);
-foreach(lc, tables)
+relations = fetch_table_list(wrconn, publications);

Is it allowed to connect a newer subscriber to an old publisher? If
yes the query
to fetch sequences will throw an error since it won't find the catalog.

@@ -882,8 +886,10 @@ AlterSubscription_refresh(Subscription *sub, bool
copy_data,
-/* Get the table list from publisher. */
+/* Get the list of relations from publisher. */
 pubrel_names = fetch_table_list(wrconn, sub->publications);
+pubrel_names = list_concat(pubrel_names,
+   fetch_sequence_list(wrconn,
sub->publications));

Similarly here.

+void
+logicalrep_write_sequence(StringInfo out, Relation rel, TransactionId xid,
+
... snip ...
+pq_sendint8(out, flags);
+pq_sendint64(out, lsn);
... snip ...
+LogicalRepRelId
+logicalrep_read_sequence(StringInfo in, LogicalRepSequence *seqdata)
+{
... snip ...
+/* XXX skipping flags and lsn */
+pq_getmsgint(in, 1);
+pq_getmsgint64(in);

We are ignoring these two fields on the WAL receiver side. I don't see such
fields being part of INSERT, UPDATE or DELETE messages. Should we just drop
those or do they have some future use? Two lsns are written by
OutputPrepareWrite() as prologue to the logical message. If this LSN
is one of them, it could be dropped anyway.


+static void

Re: Named Operators

2023-07-04 Thread Daniel Gustafsson
> On 8 Feb 2023, at 16:57, Tom Lane  wrote:

> I do not think this proposal is going anywhere as-written.

Reading this thread, it seems there is consensus against this proposal in its
current form, and no updated patch has been presented, so I will mark this as
Returned with Feedback.  Please feel free to resubmit to a future CF when there
is renewed interest in working on this.

--
Daniel Gustafsson





Re: Remove incidental md5() function uses from several tests

2023-07-04 Thread Peter Eisentraut

On 07.06.23 10:19, Daniel Gustafsson wrote:

Unlike for the main regression tests, I didn't write a fipshash() wrapper here, 
because that would have been too repetitive and wouldn't really save much here. 
 In some cases it was easier to remove one layer of indirection by changing 
column types from text to bytea.


Agreed.  Since the commit message mentions 208bf364a9 it would probably be a
good idea to add some version of the above fipshash clarification to the commit
message.


Committed with that addition, thanks.





Re: pipe_read_line for reading arbitrary strings

2023-07-04 Thread Daniel Gustafsson
> On 4 Jul 2023, at 13:59, Heikki Linnakangas  wrote:
> On 08/03/2023 00:05, Daniel Gustafsson wrote:

>> If we are going to continue using this for reading $stuff from pipes, maybe 
>> we
>> should think about presenting a nicer API which removes that risk?  Returning
>> an allocated buffer which contains all the output along the lines of the 
>> recent
>> pg_get_line work seems a lot nicer and safer IMO.
> 
> +1

Thanks for review!

>> /*
>> * Execute a command in a pipe and read the first line from it. The returned
>> * string is allocated, the caller is responsible for freeing.
>> */
>> char *
>> pipe_read_line(char *cmd)
> 
> I think it's worth being explicit here that it's palloc'd, or malloc'd in 
> frontend programs, rather than just "allocated". Like in pg_get_line.

Good point, I'll make that happen before committing this.

--
Daniel Gustafsson





Re: SQL:2011 application time

2023-07-04 Thread Daniel Gustafsson
> On 8 May 2023, at 09:10, Peter Eisentraut  
> wrote:
> 
> On 03.05.23 23:02, Paul Jungwirth wrote:
>> Thank you again for the review. Here is a patch with most of your feedback 
>> addressed. Sorry it has taken so long! These patches are rebased up to 
>> 1ab763fc22adc88e5d779817e7b42b25a9dd7c9e
>> (May 3).
> 
> Here are a few small fixup patches to get your patch set compiling cleanly.
> 
> Also, it looks like the patches 0002, 0003, and 0004 are not split up 
> correctly.  0002 contains tests using the FOR PORTION OF syntax introduced in 
> 0003, and 0003 uses the function build_period_range() from 0004.

These patches no longer apply without a new rebase.  Should this patch be
closed while waiting for the prerequisite of adding btree_gist to core
mentioned upthread?  I see no patch registered in the CF for this unless I'm
missing something.

--
Daniel Gustafsson





Re: [PATCH]Feature improvement for MERGE tab completion

2023-07-04 Thread Daniel Gustafsson
> On 28 Mar 2023, at 20:55, Gregory Stark (as CFM)  wrote:
> 
> It looks like this remaining work isn't going to happen this CF and
> therefore this release. There hasn't been an update since January when
> Dean Rasheed posted a review.
> 
> Unless there's any updates soon I'll move this on to the next
> commitfest or mark it returned with feedback.

There are still no updates to this patch or thread, so I'm closing this as
Returned with Feedback.  Please feel free to resubmit to a future CF when there
is renewed interest in working on this.

--
Daniel Gustafsson





Re: Clean up command argument assembly

2023-07-04 Thread Heikki Linnakangas

On 26/06/2023 12:33, Peter Eisentraut wrote:

This is a small code cleanup patch.

Several commands internally assemble command lines to call other
commands.  This includes initdb, pg_dumpall, and pg_regress.  (Also
pg_ctl, but that is different enough that I didn't consider it here.)
This has all evolved a bit organically, with fixed-size buffers, and
various optional command-line arguments being injected with
confusing-looking code, and the spacing between options handled in
inconsistent ways.  This patch cleans all this up a bit to look clearer
and be more easily extensible with new arguments and options.


+1


We start each command with printfPQExpBuffer(), and then append
arguments as necessary with appendPQExpBuffer().  Also standardize on
using initPQExpBuffer() over createPQExpBuffer() where possible.
pg_regress uses StringInfo instead of PQExpBuffer, but many of the
same ideas apply.


It's a bit bogus to use PQExpBuffer for these. If you run out of memory, 
you silently get an empty string instead. StringInfo, which exits the 
process on OOM, would be more appropriate. We have tons of such 
inappropriate uses of PQExpBuffer in all our client programs, though, so 
I don't insist on fixing this particular case right now.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Improve the performance of nested loop join in the case of partitioned inner table

2023-07-04 Thread David Rowley
On Thu, 13 Apr 2023 at 03:00, Alexandr Nikulin
 wrote:
> explain analyze select * from ids join test_part on ids.id=test_part.id where 
> ascii(ids.name)=ascii('best case');
> explain analyze select * from ids join test_part on ids.id=test_part.id where 
> ascii(ids.name)=ascii('worst case');
>
> The average results on my machine are as follows:
>
>            | vanilla postgres | patched postgres
> best case  | 2286 ms          | 1924 ms
> worst case | 2278 ms          | 2360 ms
>
> So far I haven't refactored the patch as per David's advice. I just want to 
> understand if we need such an optimization?

My thoughts are that the worst-case numbers are not exactly great.  I
very much imagine that in the average case, the parameter values from the
nested loop's next outer row are much more likely to be different from the
previous row's than to be the same.

Let's say, roughly, your numbers show a 20% speedup for the best case and
a 4% slowdown for the worst case; for us to break even with this patch as
it is, the parameter value would have to be the same around 1 out of 5
times. That does not seem like good odds to bet on, given we're likely
working with data types that allow billions of distinct values.

I think if you really wanted to make this work, then you'd need to get
the planner on board with making the decision on if this should be
done or not based on the n_distinct estimates from the outer side of
the join.  Either that or some heuristic in the executor that tries
for a while and gives up if the parameter value changes too often.
Some code was added in 3592e0ff9 that uses a heuristics approach to
solving this problem by only enabling the optimisation if we hit the
same partition at least 16 times and switches it off again as soon as
the datum no longer matches the cached partition.  I'm not quite sure
how the same could be made to work here as with 3592e0ff9. A tuple
only belongs to a single partition and we can very cheaply check if
this partition is the same as the last one by checking if the
partition index matches.  With this case, since we're running a query,
many partitions can remain after partition pruning runs, and checking
that some large number of partitions match some other large number of
partitions is not going to be as cheap as checking just two partitions
match. Bitmapsets can help here, but they'll just never be as fast as
checking two ints match.

In short, I think you're going to have to come up with something very
crafty here to reduce the overhead worst case. Whatever it is will
need to be neat and self-contained, perhaps in execPartition.c.  It
just does not seem good to have logic related to partition pruning
inside nodeNestloop.c.

I'm going to mark this as waiting on author in the CF app. It might be
better if you withdraw it and resubmit when you have a patch that
addresses the worst-case regression issue.

David




Re: pipe_read_line for reading arbitrary strings

2023-07-04 Thread Heikki Linnakangas

On 08/03/2023 00:05, Daniel Gustafsson wrote:

When skimming through pg_rewind during a small review I noticed the use of
pipe_read_line for reading arbitrary data from a pipe, the mechanics of which
seemed odd.

Commit 5b2f4afffe6 refactored find_other_exec() and broke out pipe_read_line()
as a static convenience routine for reading a single line of output to catch a
version number.  Many years later, commit a7e8ece41 exposed it externally in
order to read a GUC from postgresql.conf using "postgres -C ..".  f06b1c598
also make use of it for reading a version string much like find_other_exec().
Funnily enough, while now used for arbitrary string reading the variable is
still "pgver".

Since the function requires passing a buffer/size, and at most size - 1 bytes
will be read via fgets(), there is a truncation risk when using this for
reading GUCs (like how pg_rewind does, though the risk there is slim to none).


Good point.


If we are going to continue using this for reading $stuff from pipes, maybe we
should think about presenting a nicer API which removes that risk?  Returning
an allocated buffer which contains all the output along the lines of the recent
pg_get_line work seems a lot nicer and safer IMO.


+1


/*
 * Execute a command in a pipe and read the first line from it. The returned
 * string is allocated, the caller is responsible for freeing.
 */
char *
pipe_read_line(char *cmd)


I think it's worth being explicit here that it's palloc'd, or malloc'd 
in frontend programs, rather than just "allocated". Like in pg_get_line.


Other than that, LGTM.

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: brininsert optimization opportunity

2023-07-04 Thread Tomas Vondra



On 7/4/23 13:23, Alvaro Herrera wrote:
> On 2023-Jul-03, Soumyadeep Chakraborty wrote:
> 
>> My colleague, Ashwin, pointed out to me that brininsert's per-tuple init
>> of the revmap access struct can have non-trivial overhead.
>>
>> Turns out he is right. We are saving 24 bytes of memory per-call for
>> the access struct, and a bit on buffer/locking overhead, with the
>> attached patch.
> 
> Hmm, yeah, I remember being bit bothered by this repeated
> initialization.  Your patch looks reasonable to me.  I would set
> bistate->bs_rmAccess to NULL in the cleanup callback, just to be sure.
> Also, please add comments atop these two new functions, to explain what
> they are.
> 
> Nice results.
> 

Yeah. I wonder how much of that runtime is the generate_series(),
though. What's the speedup if that part is subtracted. It's guaranteed
to be even more significant, but by how much?

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-07-04 Thread jian he
On Tue, Jul 4, 2023 at 5:45 PM Japin Li  wrote:

>
> On Tue, 04 Jul 2023 at 17:00, jian he  wrote:
> > the following will also crash. no idea why.
> > begin;
> > select count(*) from onek;
> > select relpages from pg_class where relname = 'onek'; --queryA
> >
> > SELECT count(*) FROM pg_buffercache WHERE relfilenode =
> > pg_relation_filenode('onek'::regclass); --queryB
> >
> > insert into onek values(default);
> >
> > select count(pg_buffercache_invalidate(bufferid)) from
> > pg_buffercache where relfilenode =
> > pg_relation_filenode('onek'::regclass);
> >
> > -
> > queryA returns 35, queryB returns 37.
> > --
> > crash info:
> > test_dev=*# insert into onek values(default);
> > INSERT 0 1
> > test_dev=*# select count(pg_buffercache_invalidate(bufferid)) from
> > pg_buffercache where relfilenode =
> > pg_relation_filenode('onek'::regclass);
> > TRAP: failed Assert("resarr->nitems < resarr->maxitems"), File:
> >
> "../../Desktop/pg_sources/main/postgres/src/backend/utils/resowner/resowner.c",
> > Line: 275, PID: 1533312
>
> According to the comments of ResourceArrayAdd(), the caller must have
> previously
> done ResourceArrayEnlarge(). I tried to call ResourceOwnerEnlargeBuffers()
> before
> PinBuffer_Locked(), so it can avoid this crash.
>
> if ((buf_state & BM_DIRTY) == BM_DIRTY)
> {
> +   /* make sure we can handle the pin */
> +   ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
> +
> /*
>  * Try once to flush the dirty buffer.
>  */
> PinBuffer_Locked(bufHdr);
>
> --
> Regrads,
> Japin Li.
>


Thanks. I tested flushing the pg_catalog and public schemas; both now work as pitched.
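
For reference, flushing everything that belongs to one schema can be done
with a query along these lines (a sketch using the
pg_buffercache_invalidate(bufferid) function from this patch; it ignores
database/tablespace disambiguation of relfilenode for brevity):

SELECT count(pg_buffercache_invalidate(b.bufferid))
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
  JOIN pg_namespace n ON c.relnamespace = n.oid
 WHERE n.nspname = 'public';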


Re: Making empty Bitmapsets always be NULL

2023-07-04 Thread Yuya Watari
Hello,

On Tue, Jul 4, 2023 at 9:36 AM David Rowley  wrote:
> I've now pushed the patch.

Thanks for the commit!

-- 
Best regards,
Yuya Watari




Re: brininsert optimization opportunity

2023-07-04 Thread Alvaro Herrera
On 2023-Jul-03, Soumyadeep Chakraborty wrote:

> My colleague, Ashwin, pointed out to me that brininsert's per-tuple init
> of the revmap access struct can have non-trivial overhead.
> 
> Turns out he is right. We are saving 24 bytes of memory per-call for
> the access struct, and a bit on buffer/locking overhead, with the
> attached patch.

Hmm, yeah, I remember being a bit bothered by this repeated
initialization.  Your patch looks reasonable to me.  I would set
bistate->bs_rmAccess to NULL in the cleanup callback, just to be sure.
Also, please add comments atop these two new functions, to explain what
they are.

Nice results.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: Incremental sort for access method with ordered scan support (amcanorderbyop)

2023-07-04 Thread David Rowley
On Tue, 4 Jul 2023 at 20:12, Richard Guo  wrote:
> The v4 patch looks good to me (maybe some cosmetic tweaks are still
> needed for the comments).  I think it's now 'Ready for Committer'.

I agree. I went and hit the comments with a large hammer and while
there also adjusted the regression tests. I didn't think having "t" as
a table name was a good idea as it seems like a name with a high risk
of conflicting with a concurrently running test. Also, there didn't
seem to be much need to insert data into that table as the tests
didn't query any of it.

The only other small tweak I made was to not call list_copy_head()
when the list does not need to be shortened. It's likely not that
important, but if the majority of cases are not partial matches, then
we'd otherwise be needlessly making copies of the list.

I pushed the adjusted patch.

David




Re: Add 64-bit XIDs into PostgreSQL 15

2023-07-04 Thread Aleksander Alekseev
Hi,

> This patch hasn't applied in quite some time, and the thread has moved to
> discussing higher level items rather than the suggested patch, so I'm closing
> this as Returned with Feedback.  Please feel free to resubmit when there is
> renewed interest and a concensus on how/what to proceed with.

Yes, this thread awaits several other patches to be merged [1] in
order to continue, so it makes sense to mark it as RwF for the time
being. Thanks!

[1]: https://commitfest.postgresql.org/43/3489/

-- 
Best regards,
Aleksander Alekseev




Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

2023-07-04 Thread Amit Kapila
On Mon, Jul 3, 2023 at 7:45 AM Masahiko Sawada  wrote:
>
> Commit 89e46da5e5 allowed us to use indexes for searching on REPLICA
> IDENTITY FULL tables. The documentation explains:
>
> When replica identity full is specified,
> indexes can be used on the subscriber side for searching the rows.  Candidate
> indexes must be btree, non-partial, and have at least one column reference
> (i.e. cannot consist of only expressions).
>
> To be exact, IIUC the column reference must be on the leftmost column
> of indexes. Does it make sense to mention that?
>

Yeah, I think it is good to mention that. Accordingly, the comments
atop build_replindex_scan_key(),
FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression()
should also be adjusted.

-- 
With Regards,
Amit Kapila.




Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-07-04 Thread Japin Li


On Tue, 04 Jul 2023 at 17:00, jian he  wrote:
> the following will also crash. no idea why.
> begin;
> select count(*) from onek;
> select relpages from pg_class where relname = 'onek'; --queryA
>
> SELECT count(*) FROM pg_buffercache WHERE relfilenode =
> pg_relation_filenode('onek'::regclass); --queryB
>
> insert into onek values(default);
>
> select count(pg_buffercache_invalidate(bufferid)) from
> pg_buffercache where relfilenode =
> pg_relation_filenode('onek'::regclass);
>
> -
> queryA returns 35, queryB returns 37.
> --
> crash info:
> test_dev=*# insert into onek values(default);
> INSERT 0 1
> test_dev=*# select count(pg_buffercache_invalidate(bufferid)) from
> pg_buffercache where relfilenode =
> pg_relation_filenode('onek'::regclass);
> TRAP: failed Assert("resarr->nitems < resarr->maxitems"), File:
> "../../Desktop/pg_sources/main/postgres/src/backend/utils/resowner/resowner.c",
> Line: 275, PID: 1533312

According to the comments of ResourceArrayAdd(), the caller must have
previously done ResourceArrayEnlarge(). I tried calling
ResourceOwnerEnlargeBuffers() before PinBuffer_Locked(), which avoids this
crash.

if ((buf_state & BM_DIRTY) == BM_DIRTY)
{
+   /* make sure we can handle the pin */
+   ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
+
/*
 * Try once to flush the dirty buffer.
 */
PinBuffer_Locked(bufHdr);

-- 
Regrads,
Japin Li.




Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-07-04 Thread jian he
the following will also crash. no idea why.
begin;
select count(*) from onek;
select relpages from pg_class where relname = 'onek'; --queryA

SELECT count(*) FROM pg_buffercache WHERE relfilenode =
pg_relation_filenode('onek'::regclass); --queryB

insert into onek values(default);

select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('onek'::regclass);

-
queryA returns 35, queryB returns 37.
--
crash info:
test_dev=*# insert into onek values(default);
INSERT 0 1
test_dev=*# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('onek'::regclass);
TRAP: failed Assert("resarr->nitems < resarr->maxitems"), File:
"../../Desktop/pg_sources/main/postgres/src/backend/utils/resowner/resowner.c",
Line: 275, PID: 1533312
postgres: jian test_dev [local]
SELECT(ExceptionalCondition+0xa1)[0x55fc8f8d14e1]
postgres: jian test_dev [local] SELECT(+0x9e7ab3)[0x55fc8f915ab3]
postgres: jian test_dev [local]
SELECT(ResourceOwnerRememberBuffer+0x1d)[0x55fc8f91696d]
postgres: jian test_dev [local] SELECT(+0x78ab17)[0x55fc8f6b8b17]
postgres: jian test_dev [local]
SELECT(TryInvalidateBuffer+0x6d)[0x55fc8f6c507d]
/home/jian/postgres/pg16_test/lib/pg_buffercache.so(pg_buffercache_invalidate+0x3d)[0x7f2361837abd]
postgres: jian test_dev [local] SELECT(+0x57eebc)[0x55fc8f4acebc]
postgres: jian test_dev [local]
SELECT(ExecInterpExprStillValid+0x3c)[0x55fc8f4a6e2c]
postgres: jian test_dev [local] SELECT(+0x5a0f16)[0x55fc8f4cef16]
postgres: jian test_dev [local] SELECT(+0x5a3588)[0x55fc8f4d1588]
postgres: jian test_dev [local] SELECT(+0x58f747)[0x55fc8f4bd747]
postgres: jian test_dev [local]
SELECT(standard_ExecutorRun+0x1f0)[0x55fc8f4b29f0]
postgres: jian test_dev [local] SELECT(ExecutorRun+0x46)[0x55fc8f4b2d16]
postgres: jian test_dev [local] SELECT(+0x7eb3b0)[0x55fc8f7193b0]
postgres: jian test_dev [local] SELECT(PortalRun+0x1eb)[0x55fc8f71b7ab]
postgres: jian test_dev [local] SELECT(+0x7e8cf4)[0x55fc8f716cf4]
postgres: jian test_dev [local] SELECT(PostgresMain+0x134f)[0x55fc8f71869f]
postgres: jian test_dev [local] SELECT(+0x70f80c)[0x55fc8f63d80c]
postgres: jian test_dev [local]
SELECT(PostmasterMain+0x1758)[0x55fc8f63f278]
postgres: jian test_dev [local] SELECT(main+0x27e)[0x55fc8f27067e]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f2361629d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f2361629e40]
postgres: jian test_dev [local] SELECT(_start+0x25)[0x55fc8f272bb5]
2023-07-04 16:56:13.088 CST [1532822] LOG:  server process (PID 1533312)
was terminated by signal 6: Aborted
2023-07-04 16:56:13.088 CST [1532822] DETAIL:  Failed process was running:
select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('onek'::regclass);
2023-07-04 16:56:13.088 CST [1532822] LOG:  terminating any other active
server processes
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: 2023-07-04
16:56:13.091 CST [1533381] FATAL:  the database system is in recovery mode
Failed.
The connection to the server was lost. Attempting reset: Failed.


Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

2023-07-04 Thread Daniel Gustafsson
Have you had a chance to address the comments raised by Michael in his last
review such that a new patch revision can be submitted?

--
Daniel Gustafsson





Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-07-04 Thread Japin Li


On Mon, 03 Jul 2023 at 16:26, Palak Chaturvedi  
wrote:
> Hi Thomas,
> Thank you for your suggestions. I have added the sql in the meson
> build as well.
>
> On Sat, 1 Jul 2023 at 03:39, Thomas Munro  wrote:
>>
>> On Fri, Jun 30, 2023 at 10:47 PM Palak Chaturvedi
>>  wrote:
>> > pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
>> > pg_buffercache where relfilenode =
>> > pg_relation_filenode('pgbench_accounts'::regclass);
>>
>> Hi Palak,
>>
>> Thanks for working on this!  I think this will be very useful for
>> testing existing workloads but also for testing future work on
>> prefetching with AIO (and DIO), work on putting SLRUs (or anything
>> else) into the buffer pool, nearby proposals for caching buffer
>> mapping information, etc etc.
>>
>> Palak and I talked about this idea a bit last week (stimulated by a
>> recent thread[1], but the topic has certainly come up before), and we
>> discussed some different ways one could specify which pages are
>> dropped.  For example, perhaps the pg_prewarm extension could have an
>> 'unwarm' option instead.  I personally thought the buffer ID-based
>> approach was quite good because it's extremely simple, while giving
>> the user the full power of SQL to say which buffers.   Half a table?
>> Visibility map?  Everything?  Root page of an index?  I think that's
>> probably better than something that requires more code and
>> complication but is less flexible in the end.  It feels like the right
>> level of rawness for something primarily of interest to hackers and
>> advanced users.  I don't think it matters that there is a window
>> between selecting a buffer ID and invalidating it, for the intended
>> use cases.  That's my vote, anyway, let's see if others have other
>> ideas...
>>
>> We also talked a bit about how one might control the kernel page cache
>> in more fine-grained ways for testing purposes, but it seems like the
>> pgfincore project has that covered with its pgfadvise_willneed() and
>> pgfadvise_dontneed().  IMHO that project could use more page-oriented
>> operations (instead of just counts and coarse grains operations) but
>> that's something that could be material for patches to send to the
>> extension maintainers.  This work, in contrast, is more tangled up
>> with bufmgr.c internals, so it feels like this feature belongs in a
>> core contrib module.
>>
>> Some initial thoughts on the patch:
>>
>> I wonder if we should include a simple exercise in
>> contrib/pg_buffercache/sql/pg_buffercache.sql.  One problem is that
>> it's not guaranteed to succeed in general.  It doesn't wait for pins
>> to go away, and it doesn't retry cleaning dirty buffers after one
>> attempt, it just returns false, which I think is probably the right
>> approach, but it makes the behaviour too non-deterministic for simple
>> tests.  Perhaps it's enough to include an exercise where we call it a
>> few times to hit a couple of cases, but not verify what effect it has.
>>
>> It should be restricted by role, but I wonder which role it should be.
>> Testing for superuser is now out of fashion.
>>
>> Where the Makefile mentions 1.4--1.5.sql, the meson.build file needs
>> to do the same.  That's because PostgreSQL is currently in transition
>> from autoconf/gmake to meson/ninja[2], so for now we have to maintain
>> both build systems.  That's why it fails to build in some CI tasks[3].
>> You can enable CI in your own GitHub account if you want to run test
>> builds on several operating systems, see [4] for info.
>>
>> [1] 
>> https://www.postgresql.org/message-id/flat/CAFSGpE3y_oMK1uHhcHxGxBxs%2BKrjMMdGrE%2B6HHOu0vttVET0UQ%40mail.gmail.com
>> [2] https://wiki.postgresql.org/wiki/Meson
>> [3] http://cfbot.cputube.org/palak-chaturvedi.html
>> [4] 
>> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/tools/ci/README;hb=HEAD

I think zero is not a valid buffer identifier; see src/include/storage/buf.h.

+   bufnum = PG_GETARG_INT32(0);
+   if (bufnum < 0 || bufnum > NBuffers)
+   {
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("buffernum is not valid")));
+
+   }

If we use SELECT pg_buffercache_invalidate(0), it will crash.
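
For illustration, a minimal check that also rejects zero could look like this
(keeping the patch's ereport style; the message wording here is just an
example):

bufnum = PG_GETARG_INT32(0);
if (bufnum <= 0 || bufnum > NBuffers)
    ereport(ERROR,
            (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
             errmsg("buffer identifier %d is out of range (1..%d)",
                    bufnum, NBuffers)));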

-- 
Regards,
Japin Li.




Re: Incremental sort for access method with ordered scan support (amcanorderbyop)

2023-07-04 Thread Richard Guo
On Sun, Jul 2, 2023 at 12:02 PM Miroslav Bendik 
wrote:

> Thanks, for suggestions.
>
> On Sun 02. 07. 2023 at 10:18 Richard Guo  wrote:
> > 1. For comment "On success, the result list is ordered by pathkeys.", I
> > think it'd be more accurate if we say something like "On success, the
> > result list is ordered by pathkeys or a prefix list of pathkeys."
> > considering the possibility of incremental sort.
> >
> > 2. The comment below is not true anymore.
> >
> >/*
> > * Note: for any failure to match, we just return NIL immediately.
> > * There is no value in matching just some of the pathkeys.
> > */
> > We should either remove it or change it to emphasize that we may return
> > a prefix of the pathkeys for incremental sort.
>
> Comments are updated now.
>
> > BTW, would you please add the patch to the CF to not lose track of it?
>
> Submitted 


The v4 patch looks good to me (maybe some cosmetic tweaks are still
needed for the comments).  I think it's now 'Ready for Committer'.

Thanks
Richard


Re: Exposing the lock manager's WaitForLockers() to SQL

2023-07-04 Thread Will Mortensen
Updated patch with more tests and a first attempt at doc updates.

As the commit message and doc now point out, using
WaitForLockersMultiple() behaves differently from actually locking
multiple tables: the combined set of conflicting locks is obtained only
once for all tables, rather than obtaining the conflicts for the first
table and locking / waiting on it, then obtaining the conflicts for the
second table, and so on (see the sketch below). This is definitely
desirable for my use case, but maybe these kinds of differences
illustrate the potential awkwardness of extending LOCK?
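
For illustration, a rough sketch of that difference in terms of the lmgr
calls (rel1_oid and rel2_oid are placeholder OIDs; the lock mode is
arbitrary):

LOCKTAG     tag1, tag2;
List       *locktags = NIL;

SET_LOCKTAG_RELATION(tag1, MyDatabaseId, rel1_oid);
SET_LOCKTAG_RELATION(tag2, MyDatabaseId, rel2_oid);
locktags = lappend(locktags, &tag1);
locktags = lappend(locktags, &tag2);

/* combined: collect the conflicting lockers of both tables once, wait once */
WaitForLockersMultiple(locktags, ShareLock, true);

/*
 * versus per-table: the second wait can pick up lockers that only appeared
 * while we were waiting on the first table
 */
WaitForLockers(tag1, ShareLock, true);
WaitForLockers(tag2, ShareLock, true);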

Thanks again for any and all feedback!


v2-0001-Add-WAIT-ONLY-option-to-LOCK-statement.patch
Description: Binary data


Re: suppressing useless wakeups in logical/worker.c

2023-07-04 Thread Daniel Gustafsson
> On 17 Mar 2023, at 10:16, Amit Kapila  wrote:

> Few minor comments:

Have you had a chance to address the comments raised by Amit in this thread?

--
Daniel Gustafsson





Re: Allow parallel plan for referential integrity checks?

2023-07-04 Thread Daniel Gustafsson
> On 20 Mar 2023, at 16:48, Frédéric Yhuel  wrote:
> On 3/20/23 15:58, Gregory Stark (as CFM) wrote:

>> Should we move it to next release at this
>> point? Even if you get time to work on it this commitfest do you think
>> it's likely to be committable in the next few weeks?
> 
> It is very unlikely. Maybe it's better to remove it from CF and put it back 
> later if the test case I will provide does a better job at convincing the 
> Postgres folks that RI checks should be parallelized.

As there is no new patch submitted I will go ahead and do that, please feel
free to resubmit when there is renewed interest in working on this.

--
Daniel Gustafsson





Re: Add 64-bit XIDs into PostgreSQL 15

2023-07-04 Thread Daniel Gustafsson
This patch hasn't applied in quite some time, and the thread has moved to
discussing higher level items rather than the suggested patch, so I'm closing
this as Returned with Feedback.  Please feel free to resubmit when there is
renewed interest and a consensus on how/what to proceed with.

--
Daniel Gustafsson





Re: [PATCH] Add native windows on arm64 support

2023-07-04 Thread Michael Paquier
On Thu, May 11, 2023 at 01:19:54PM +0900, Michael Paquier wrote:
> On Thu, May 11, 2023 at 12:17:25PM +1200, Thomas Munro wrote:
>> Azure does have an image "Microsoft Windows 11 Preview arm64" to run
>> on Ampere CPUs, which says it's a pre-release version intended for
>> validation, which sounds approximately like what we want.  I will try
>> to find out more.
> 
> Interesting.  Thanks for mentioning it.

Now that v17 is open, I was looking at v8 posted at [1] and I don't
have much more to add about it.  The lack of a buildfarm machine is
still annoying, but I don't see a reason not to
move forward and let people play with this stuff on HEAD.  At least
that would be progress.  Any thoughts?

Thomas, what's the state of ARM support for Windows on Azure?  Is that
still in preview?

[1]: 
https://www.postgresql.org/message-id/dbee741f-b9b7-a0d5-1b1b-f9b532bb6f56%40linaro.org
--
Michael


signature.asc
Description: PGP signature


Re: Autogenerate some wait events code and documentation

2023-07-04 Thread Drouvot, Bertrand

Hi,

On 7/3/23 9:11 AM, Michael Paquier wrote:

On Mon, Jul 03, 2023 at 03:57:42PM +0900, Michael Paquier wrote:


Thanks for looking at it and for fixing the issues that were present in
v10.


I think that we should add some options to the perl script to be more
selective with the files generated.  How about having two options
called --docs and --code to select one or the other, then limit what
gets generated in each path?  I guess that it would be cleaner if we
error in the case where both options are defined, and just use some
gotos to redirect to each foreach loop to limit extra indentations in
the script.  This would avoid the need to remove the C and H files
from the docs, additionally, which is what the Makefile in doc/ does.

I have fixed all the issues I've found in v11 attached, except for the
last one (I have added the OUTDIR trick for reference, but that's
incorrect and incomplete).  Could you look at this part?


Ah.  It took me a few extra minutes, but I think that we should set
"capture" to "false", no?  It looks like meson is getting confused,
expecting something on stdout, but the new script generates a few
files and does not output anything.  That's different from the other
doc-related perl scripts.
--


Yeah, with "capture" set to "false" then ninja alldocs does not error out
and wait_event_types.sgml gets generated.

I'll look at the extra options --code and --docs.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Check lateral references within PHVs for memoize cache keys

2023-07-04 Thread Richard Guo
On Fri, Dec 30, 2022 at 11:00 AM Richard Guo  wrote:

> On Fri, Dec 9, 2022 at 5:16 PM Richard Guo  wrote:
>
>> Actually we do check PHVs for lateral references, earlier in
>> create_lateral_join_info.  But at that time we only marked lateral_relids
>> and direct_lateral_relids, without remembering the lateral expressions.
>> So I'm wondering whether we can fix that by fetching Vars (or PHVs) of
>> lateral references within PlaceHolderVars and remembering them in the
>> baserel's lateral_vars.
>>
>> Attach a draft patch to show my thoughts.
>>
>
> Update the patch to fix test failures.
>

Rebase the patch on HEAD as cfbot reminds.

Thanks
Richard


v3-0001-Check-lateral-refs-within-PHVs-for-memoize-cache-keys.patch
Description: Binary data


Re: O(n) tasks cause lengthy startups and checkpoints

2023-07-04 Thread Daniel Gustafsson
> On 4 Apr 2023, at 05:36, Nathan Bossart  wrote:
> 
> I sent this one to the next commitfest and marked it as waiting-on-author
> and targeted for v17.  I'm aiming to have something that addresses the
> latest feedback ready for the July commitfest.

Have you had a chance to look at this such that there is something ready?

--
Daniel Gustafsson





Re: real/float example for testlibpq3

2023-07-04 Thread Daniel Gustafsson
> On 21 Mar 2023, at 14:44, Mark Wong  wrote:
> 
> On Mon, Mar 20, 2023 at 01:37:57PM -0400, Gregory Stark (as CFM) wrote:
>> On Mon, 23 Jan 2023 at 11:54, Mark Wong  wrote:
>> fficient way to communicate useful information.
>>> 
>>> Yeah, I will try to cover all the data types we ship. :)  I'll keep at
>>> it and drop the code examples.
>> 
>> I assume this isn't going to happen for this commitfest? If you think
>> it is then shout otherwise I'll mark it Returned with Feedback and
>> move it on to the next release.
> 
> Sounds good.  I actually thought I did that already, thanks for catching
> that.

This has been waiting on author since January, unless there is a new patch
ready I will close this as Returned with Feedback and you can resubmit for
another CF when there is a new patch.

--
Daniel Gustafsson





Re: collect_corrupt_items_vacuum.patch

2023-07-04 Thread Daniel Gustafsson
This patch has been waiting on the author for about a year now, so I will close
it as Returned with Feedback.  Please feel free to resubmit to a future CF when
there is renewed interest in working on this.

--
Daniel Gustafsson





Re: Deleting prepared statements from libpq.

2023-07-04 Thread Michael Paquier
On Tue, Jul 04, 2023 at 08:28:40AM +0900, Michael Paquier wrote:
> Sure, feel free.  I was planning to look at and play more with it.

Well, done.
--
Michael


signature.asc
Description: PGP signature


Re: PG 16 draft release notes ready

2023-07-04 Thread Michael Paquier
On Thu, May 18, 2023 at 04:49:47PM -0400, Bruce Momjian wrote:
> I have completed the first draft of the PG 16 release notes.  You can
> see the output here:
> 
>   https://momjian.us/pgsql_docs/release-16.html
> 
> I will adjust it to the feedback I receive;  that URL will quickly show
> all updates.

Sawada-san has mentioned on twitter that fdd8937 is not mentioned in
the release notes, and it seems to me that he is right.  This is
described as a bug in the commit log, but it did not get backpatched
because of the lack of complaints.  Also, because we've removed
support for anything older than Windows 10 in PG16, this change was very
easy to do.
--
Michael


signature.asc
Description: PGP signature


Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2023-07-04 Thread Michael Paquier
On Tue, Jul 04, 2023 at 02:15:18PM +0800, Julien Rouhaud wrote:
> Thanks, I actually saw that and already took care of removing openssl support a
> couple of hours ago, and also added a new note on the animal to remember when
> it was removed.  It should come back to green at the next scheduled run.

Thanks!
--
Michael


signature.asc
Description: PGP signature


Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2023-07-04 Thread Julien Rouhaud
Hi,

On Tue, Jul 04, 2023 at 03:03:01PM +0900, Michael Paquier wrote:
> On Tue, Jul 04, 2023 at 07:16:47AM +0900, Michael Paquier wrote:
> > The second and third animals to fail are skate and snapper, both using
> > Debian 7 Wheezy.  As far as I know, it was an LTS supported until
> > 2018.  The owner of both machines is added in CC.  I guess that for
> > this stuff we could just remove --with-openssl from the configure
> > switches.
>
> lapwing has reported a failure and runs a Debian 7, so adding Julien
> in CC about the removal of --with-openssl or similar in this animal.

Thanks, I actually saw that and already took care of removing openssl support a
couple of hours ago, and also added a new note on the animal to remember when
it was removed.  It should come back to green at the next scheduled run.




Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-07-04 Thread Palak Chaturvedi
hi,
I don't think we need to check the usage count, because we are
clearing all the buffers that are not pinned.
The usage count only matters for buffer replacement; since we are not
replacing anything here, it can be ignored (rough sketch below).
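
For illustration only, a rough sketch of that check inside a bufmgr.c
helper like the TryInvalidateBuffer() seen in the stack trace earlier in
the thread (details of the actual patch may differ, and the dirty-buffer
handling is omitted):

uint32      buf_state = LockBufHdr(bufHdr);

if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
{
    /* pinned by someone: give up instead of waiting for the pin */
    UnlockBufHdr(bufHdr, buf_state);
    return false;
}

/*
 * usage_count is not checked: it only guides victim selection during
 * buffer replacement, and here we are explicitly evicting this buffer.
 */
InvalidateBuffer(bufHdr);       /* expects and releases the header spinlock */
return true;
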
On Mon, 3 Jul 2023 at 21:16, jian he  wrote:
>
> On Mon, Jul 3, 2023 at 4:26 PM Palak Chaturvedi
>  wrote:
> >
> > Hi Thomas,
> > Thank you for your suggestions. I have added the sql in the meson
> > build as well.
> >
> > On Sat, 1 Jul 2023 at 03:39, Thomas Munro  wrote:
> > >
> > > On Fri, Jun 30, 2023 at 10:47 PM Palak Chaturvedi
> > >  wrote:
> > > > pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
> > > > pg_buffercache where relfilenode =
> > > > pg_relation_filenode('pgbench_accounts'::regclass);
> > >
> > > Hi Palak,
> > >
> > > Thanks for working on this!  I think this will be very useful for
> > > testing existing workloads but also for testing future work on
> > > prefetching with AIO (and DIO), work on putting SLRUs (or anything
> > > else) into the buffer pool, nearby proposals for caching buffer
> > > mapping information, etc etc.
> > >
> > > Palak and I talked about this idea a bit last week (stimulated by a
> > > recent thread[1], but the topic has certainly come up before), and we
> > > discussed some different ways one could specify which pages are
> > > dropped.  For example, perhaps the pg_prewarm extension could have an
> > > 'unwarm' option instead.  I personally thought the buffer ID-based
> > > approach was quite good because it's extremely simple, while giving
> > > the user the full power of SQL to say which buffers.   Half a table?
> > > Visibility map?  Everything?  Root page of an index?  I think that's
> > > probably better than something that requires more code and
> > > complication but is less flexible in the end.  It feels like the right
> > > level of rawness for something primarily of interest to hackers and
> > > advanced users.  I don't think it matters that there is a window
> > > between selecting a buffer ID and invalidating it, for the intended
> > > use cases.  That's my vote, anyway, let's see if others have other
> > > ideas...
> > >
> > > We also talked a bit about how one might control the kernel page cache
> > > in more fine-grained ways for testing purposes, but it seems like the
> > > pgfincore project has that covered with its pgfadvise_willneed() and
> > > pgfadvise_dontneed().  IMHO that project could use more page-oriented
> > > operations (instead of just counts and coarse grains operations) but
> > > that's something that could be material for patches to send to the
> > > extension maintainers.  This work, in contrast, is more tangled up
> > > with bufmgr.c internals, so it feels like this feature belongs in a
> > > core contrib module.
> > >
> > > Some initial thoughts on the patch:
> > >
> > > I wonder if we should include a simple exercise in
> > > contrib/pg_buffercache/sql/pg_buffercache.sql.  One problem is that
> > > it's not guaranteed to succeed in general.  It doesn't wait for pins
> > > to go away, and it doesn't retry cleaning dirty buffers after one
> > > attempt, it just returns false, which I think is probably the right
> > > approach, but it makes the behaviour too non-deterministic for simple
> > > tests.  Perhaps it's enough to include an exercise where we call it a
> > > few times to hit a couple of cases, but not verify what effect it has.
> > >
> > > It should be restricted by role, but I wonder which role it should be.
> > > Testing for superuser is now out of fashion.
> > >
> > > Where the Makefile mentions 1.4--1.5.sql, the meson.build file needs
> > > to do the same.  That's because PostgreSQL is currently in transition
> > > from autoconf/gmake to meson/ninja[2], so for now we have to maintain
> > > both build systems.  That's why it fails to build in some CI tasks[3].
> > > You can enable CI in your own GitHub account if you want to run test
> > > builds on several operating systems, see [4] for info.
> > >
> > > [1] 
> > > https://www.postgresql.org/message-id/flat/CAFSGpE3y_oMK1uHhcHxGxBxs%2BKrjMMdGrE%2B6HHOu0vttVET0UQ%40mail.gmail.com
> > > [2] https://wiki.postgresql.org/wiki/Meson
> > > [3] http://cfbot.cputube.org/palak-chaturvedi.html
> > > [4] 
> > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/tools/ci/README;hb=HEAD
>
> newbie question:
> quote from: https://www.interdb.jp/pg/pgsql08.html
> >
> > Pinned: When the corresponding buffer pool slot stores a page and any 
> > PostgreSQL processes are accessing the page (i.e. refcount and usage_count 
> > are greater than or equal to 1), the state of this buffer descriptor is 
> > pinned.
> > Unpinned: When the corresponding buffer pool slot stores a page but no 
> > PostgreSQL processes are accessing the page (i.e. usage_count is greater 
> > than or equal to 1, but refcount is 0), the state of this buffer descriptor 
> > is unpinned.
>
>
> So do you need to check BUF_STATE_GET_REFCOUNT(buf_state) and
> 

Re: ProcessStartupPacket(): database_name and user_name truncation

2023-07-04 Thread Drouvot, Bertrand

Hi,

On 7/3/23 10:34 PM, Nathan Bossart wrote:

On Sat, Jul 01, 2023 at 04:02:06PM +0200, Drouvot, Bertrand wrote:

Please find V2 attached where it's failing as soon as the database name or
user name are detected as overlength.


Thanks, Bertrand.  I chickened out and ended up committing v1 for now
(i.e., simply removing the truncation code).  I didn't like the idea of
trying to keep the new error messages consistent with code in faraway
files, and the startup packet length limit is already pretty aggressive, so
I'm a little less concerned about lugging around long names.  Plus, I think
v2 had some subtle interactions with db_user_namespace (maybe for the
better), but I didn't spend too much time looking at that since
db_user_namespace will likely be removed soon.


Thanks Nathan for the feedback and explanations, I think that fully makes sense.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2023-07-04 Thread Michael Paquier
On Tue, Jul 04, 2023 at 07:16:47AM +0900, Michael Paquier wrote:
> The second and third animals to fail are skate and snapper, both using
> Debian 7 Wheezy.  As far as I know, it was an LTS supported until
> 2018.  The owner of both machines is added in CC.  I guess that for
> this stuff we could just remove --with-openssl from the configure
> switches.

lapwing has reported a failure and runs a Debian 7, so adding Julien
in CC about the removal of --with-openssl or similar in this animal.
--
Michael


signature.asc
Description: PGP signature


Re: Including a sample Table Access Method with core code

2023-07-04 Thread Michael Paquier
On Mon, Jul 03, 2023 at 08:33:32PM +0200, Hannu Krosing wrote:
> One thing that was briefly mentioned (but is missing from the notes)
> is the need to have a sample API client in contrib/, both to have a
> 2nd user of the API, making it more likely that non-heap AMs are doable,
> and also to serve as an easy starting point for someone interested in
> developing a new AM.

That sounds like a fair thing to have, though templates may live
better under src/test/modules.

> There are a few candidates which could be lightweight enough for this
> 
> * in-memory temp tables, especially if you specify max table size at
> creation and/or limit data types which can be used.
>
> * "overlay tables" - tables which "overlay" another - possibly
> read-only - table and store only changed rows and tombstones for
> deletions. (this likely would make more sense as a FDW itself as Table
> AM currently knows nothing about Primary Keys and these are likely
> needed for overlays)
> 
> * Table AM as a (pl/)Python Class - this is inspired by the amazing
> Multicorn [2] FDW-in-Python tool which made it ridiculously easy to
> expose anything (mailbox, twitter feed, git commit history,
> you-name-it) as a Foreign Table

I cannot say how simple that is without seeing the code, but limiting
the use of an AM to a single session sounds like a simple enough
concept, restricting its relpersistence along the way.  One
thing that may be also interesting is something that does not go
through the Postgres buffer pool.

> Included Mark Dilger directly to this mail as he mentioned he has a
> Perl script that makes a functional copy of heap AM that can be
> compiled as installed as custom AM.

A similar discussion happened around 640c198 and the creation of
dummy_index_am, where the argument was that such a module needs to
provide value in testing some of the core internals.  dummy_index_am
did so for reloptions on indexes because there was not much coverage
for that part of the system.

> @mark - maybe you can create 3 boilerplate Table AMs for the above
> named `mem_am`, `overlay_am` and `py3_am` and we could put them
> somewhere for interested parties to play with ?

Not sure if that's worth counting, but I also have a table AM template
stored in my plugin repo:
https://github.com/michaelpq/pg_plugins/tree/main/blackhole_am

It does as much as its name states, being able to eat all the data fed
to it.
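
For readers new to table AMs, the general shape of such a handler looks
roughly like this (a minimal sketch; the actual blackhole_am in that repo
fills in every required callback):

#include "postgres.h"

#include "access/tableam.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(blackhole_am_handler);

Datum
blackhole_am_handler(PG_FUNCTION_ARGS)
{
    TableAmRoutine *amroutine = makeNode(TableAmRoutine);

    /*
     * Fill in every required callback with no-op implementations that
     * accept and discard whatever they are given (omitted here).
     */

    PG_RETURN_POINTER(amroutine);
}

/*
 * Then, at the SQL level:
 *   CREATE ACCESS METHOD blackhole_am TYPE TABLE HANDLER blackhole_am_handler;
 *   CREATE TABLE t (...) USING blackhole_am;
 */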
--
Michael


signature.asc
Description: PGP signature