Re: Adding CI to our tree

2021-10-31 Thread Andres Freund
Hi,

Attached is an updated version of the CI patches.

Changes:
- more optional features are enabled on various platforms, including
  building with openssl on windows
- added a somewhat minimal README explaining how CI can be enabled in a
  repository
- some cleanup to the windows crash reporting support. I also moved the
  main.c code changes to after the CI commits themselves, as they might
  be a bit harder to get into a committable shape (at least for me)

Greetings,

Andres Freund
From 48e921a1838762554780357aacd8a092ed6460f2 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Mon, 15 Mar 2021 09:25:15 -0700
Subject: [PATCH v2 1/2] ci: Add CI for FreeBSD, Linux, MacOS and Windows,
 utilizing cirrus-ci.

---
 .cirrus.yml | 388 
 .dockerignore   |   3 +
 ci/README   |  36 +++
 ci/docker/linux_debian_bullseye |  13 ++
 ci/docker/windows_vs_2019   | 111 +
 ci/freebsd_gcp_repartition.sh   |  28 +++
 ci/pg_ci_base.conf  |  12 +
 ci/windows_build_config.pl  |  13 ++
 8 files changed, 604 insertions(+)
 create mode 100644 .cirrus.yml
 create mode 100644 .dockerignore
 create mode 100644 ci/README
 create mode 100644 ci/docker/linux_debian_bullseye
 create mode 100644 ci/docker/windows_vs_2019
 create mode 100755 ci/freebsd_gcp_repartition.sh
 create mode 100644 ci/pg_ci_base.conf
 create mode 100644 ci/windows_build_config.pl

diff --git a/.cirrus.yml b/.cirrus.yml
new file mode 100644
index 000..b3321a91ae6
--- /dev/null
+++ b/.cirrus.yml
@@ -0,0 +1,388 @@
+env:
+  # accelerate initial clone, but a bit of depth so that concurrent tasks work
+  CIRRUS_CLONE_DEPTH: 100
+  # Useful to be able to analyse what in a script takes long
+  CIRRUS_LOG_TIMESTAMP: true
+  # target to test, for all but windows
+  CHECK: check-world
+  CHECKFLAGS: -Otarget
+  PGCTLTIMEOUT: 120
+  CCACHE_MAXSIZE: "500M"
+  TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/ci/pg_ci_base.conf
+  PG_TEST_EXTRA: kerberos ldap ssl
+
+
+task:
+  name: FreeBSD
+  only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*freebsd.*'
+  compute_engine_instance:
+image_project: pg-vm-images-aio
+image: family/pg-aio-freebsd-13-0
+platform: freebsd
+cpu: 2
+memory: 2G
+disk: 50
+  env:
+CCACHE_DIR: "/tmp/ccache_dir"
+
+  ccache_cache:
+folder: "/tmp/ccache_dir"
+  sysinfo_script:
+- export || true
+  sysconfig_script:
+- sudo sysctl kern.corefile='/tmp/%N.%P.core'
+  repartition_script:
+- ci/freebsd_gcp_repartition.sh
+  create_user_script:
+- pw useradd postgres
+- chown -R postgres:postgres .
+- mkdir -p /tmp/ccache_dir
+- chown -R postgres:postgres /tmp/ccache_dir
+
+  configure_script: |
+su postgres -c './configure \
+  --enable-cassert --enable-debug --enable-tap-tests \
+  --enable-nls \
+  \
+  --with-icu \
+  --with-ldap \
+  --with-libxml \
+  --with-libxslt \
+  \
+  --with-lz4 \
+  --with-pam \
+  --with-perl \
+  --with-python \
+  --with-ssl=openssl \
+  --with-tcl --with-tclconfig=/usr/local/lib/tcl8.6/ \
+  --with-uuid=bsd \
+  \
+  --with-includes=/usr/local/include --with-libs=/usr/local/lib \
+  CC="ccache cc"'
+  build_script:
+- su postgres -c 'gmake -s -j3 && gmake -s -j3 -C contrib'
+  upload_caches:
+- ccache
+
+  tests_script:
+- su postgres -c 'time gmake -s -j2 ${CHECK} ${CHECKFLAGS}'
+
+  on_failure:
+cores_script: |
+  for corefile in $(find /tmp -name '*.core' 2>/dev/null) ; do
+binary=$(gdb -quiet -core $corefile -batch -ex 'info auxv' | grep AT_EXECPATH | perl -pe "s/^.*\"(.*)\"\$/\$1/g") ;
+echo dumping $corefile for $binary ;
+gdb --batch --quiet -ex "thread apply all bt full" -ex "quit" $binary $corefile;
+  done
+log_artifacts:
+  path: "**/**.log"
+  type: text/plain
+regress_diffs_artifacts:
+  path: "**/**.diffs"
+  type: text/plain
+tap_artifacts:
+  path: "**/regress_log_*"
+  type: text/plain
+
+
+task:
+  name: Linux
+  only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*linux.*'
+  compute_engine_instance:
+image_project: pg-vm-images-aio
+image: family/pg-aio-bullseye
+platform: linux
+cpu: 4
+memory: 2G
+nested_virtualization: false
+  env:
+CCACHE_DIR: "/tmp/ccache_dir"
+DEBUGINFOD_URLS: "https://debuginfod.debian.net"
+
+  ccache_cache:
+folder: "/tmp/ccache_dir"
+
+  sysinfo_script:
+- id
+- uname -a
+- cat /proc/cmdline
+- lsblk
+- ulimit -a -H
+- ulimit -a -S
+- export
+  sysconfig_script:
+- useradd -m postgres
+- chown -R postgres:postgres .
+- mkdir -p /tmp/ccache_dir
+- chown -R postgres:postgres /tmp/ccache_dir
+- echo '* - memlock 134217728' > /etc/security/limits.d/postgres.conf
+- 

Re: Time to drop plpython2?

2021-10-31 Thread Peter Eisentraut

On 31.10.21 19:45, Andres Freund wrote:

To me it seems time to drop plpython2 support. Supporting plpython2
until ~7 years after python2 is EOL is already quite long... It'd be one
thing if it were completely painless, but imo it's not.

I was about to list better plpython2/3 support (including the regex
replacements for the regress tests) as a TODO for the meson proposal,
but it doesn't really seem worth investing in that...


Usually, we have dropped support for older Python versions when 
continued support would be a burden for some ongoing development 
project.  This would certainly count as one of those, I think.





Re: Use -fvisibility=hidden for shared libraries

2021-10-31 Thread Andres Freund
Hi,

On 2021-11-01 05:55:40 +0100, Laurenz Albe wrote:
> I see that at least clang and gcc support this flag.

Yes - and have for a long time. 2004 or so.


> Could the reduced number of exported functions be a problem, if someone
> relies on some function being exported?  It may not be worth worrying about,
> and they can always come and make a case for that symbol to be exported.

Hm. Possible, but I don't think it's common. It's pretty rare to need
symbols from a postgres extension shared library in another shared
library. The most common case is transforms. To add a new transform
one, from what I've seen, needs to expose previously internal stuff from
an extension anyway... And transforms aren't all that common.

And yes, we can easily just add a few additional exports over time.


One nice advantage of this is that the !windows build behaves more
similarly to the windows case, which makes it easier to detect build
breakages locally...

Greetings,

Andres Freund




Re: Missing include in be-secure-openssl.c?

2021-10-31 Thread Michael Paquier
On Sun, Oct 31, 2021 at 06:45:47PM -0400, Tom Lane wrote:
> I observe that the OpenSSL docs say you are supposed to
> #include <openssl/x509.h>
> when using these functions.  We are including that header in
> some other modules, but not here.  I speculate that we've gotten
> away with that so far because of indirect inclusions; but hamerkop
> must be running an OpenSSL version that has rearranged the headers
> enough that that doesn't work anymore.  That machine was offline
> for awhile right before it started to fail, so it seems plausible
> that it was rebuilt with some pretty bleeding-edge OpenSSL version.

Hmm.  I have tested MSVC with 3.0.0 not so long ago, and this was
working, but maybe they upgraded from 1.0.2 to 3.0.0?  I'd be
surprised if this was broken in some way in one of the stable releases
as well.  We've had our share of surprises with OpenSSL when it comes
to major upgrades, but nothing that stood out when it came to minor
release compatibility and ABI, AFAIK.

> Anyway, I propose adding that #include.

openssl/ssl.h includes openssl/x509.h if OPENSSL_NO_DEPRECATED_1_1_0
is not defined, but agreed that adding the header makes sense here.

x509v3.h includes x509.h, so fe-secure-openssl.h would not need an
update.  Now could it be a better practice to include both there?
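For reference, the top of the include section could then look roughly
like this (just a sketch, with the surrounding includes omitted):

  #include <openssl/ssl.h>
  #include <openssl/x509.h>    /* X509_NAME_* functions, per the OpenSSL docs */
  #include <openssl/x509v3.h>  /* pulls in x509.h itself, but being explicit is cheap */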
--
Michael




Re: Added schema level support for publication.

2021-10-31 Thread Greg Nancarrow
On Fri, Oct 29, 2021 at 3:35 PM Amit Kapila  wrote:
>
> Sawada-San, others, what do you think? Is it really useful to have such a 
> view?
>
> One point to think is if we introduce such a view then how it should
> behave for schema objects? Do we need to display only schemas or
> additionally all the tables in the schema as well? If you follow the
> above suggestion of mine then I think it will display both schemas
> published and tables in that schema that will be considered for
> publishing.
>

I find the proposed view useful for processing the publication
structure and members in SQL, without having to piece the information
together from the other pg_publication_* tables.
Personally I don't think it is necessary to additionally display all
tables in the schema (that information can be retrieved via pg_tables
or information_schema.tables, if needed).


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Use -fvisibility=hidden for shared libraries

2021-10-31 Thread Laurenz Albe
On Sun, 2021-10-31 at 19:03 -0700, Andres Freund wrote:
> Currently postgres builds extension shared libraries (i.e. pretty much
> everything but libpq) with global visibilty. I.e. symbols that are not
> static will be visible from outside the shared library.
> 
> On the one platform where that behaviour is not available, namely
> windows, we emulate it by analyzing the input files to the shared
> library and exporting all the symbols therein.
> 
> For the meson build [1] proposal that turned out to be a bit more
> verbose to implement than I'd liked. Thinking about the problem I
> realized that, at least I think so, there's really no good reason to
> force-export symbols in our shared libraries:
> 
> Because the number of symbols required from shared libraries is
> typically small, and the majority of the symbols are generated via
> PG_FUNCTION_INFO_V1, it isn't a lot of work to explicitly export the
> necessary symbols.

That sounds like a good idea.

I see that at least clang and gcc support this flag.

Could the reduced number of exported functions be a problem, if someone
relies on some function being exported?  It may not be worth worrying about,
and they can always come and make a case for that symbol to be exported.

Yours,
Laurenz Albe





Re: pg_receivewal starting position

2021-10-31 Thread Michael Paquier
On Fri, Oct 29, 2021 at 10:13:44AM +0200, Ronan Dunklau wrote:
> We could use a single query on the primary (using the primary's checkpoint 
> LSN 
> instead)  but it feels a bit convoluted just to avoid a query on the standby.

Cheating with pg_walfile_name() running on the primary is fine by me.
One thing that stood out while reading this patch again is that we
can use $standby->slot($archive_slot) to grab the slot's restart_lsn.
I have changed that, and applied it.  So we are done with this
thread.
--
Michael




make update-po problem with USE_PGXS

2021-10-31 Thread wangsh.f...@fujitsu.com
Hi,

I'm developing an extension. After modifying some source files (adding some
messages like gettext('x')) I executed:
> $ make USE_PGXS=1 update-po
> make: Nothing to be done for 'update-po'.

This seems strange to me, because I think *.po.new files should be created.

After some research, I find in $pginstall/lib/postgresql/pgxs/src/nls-global.mk
> ifneq (,$(filter update-po %.po.new,$(MAKECMDGOALS)))
> ALL_LANGUAGES := $(shell find $(top_srcdir) -name '*.po' -print | sed 
> 's,^.*/\([^/]*\).po$$,\1,' | LC_ALL=C sort -u)
> all_compendia := $(shell find $(top_srcdir) -name '*.po' -print | LC_ALL=C 
> sort)
> else

When USE_PGXS is on, the value of $(top_srcdir) is 
> $pginstall/lib/postgresql/pgxs/src/makefiles/../..
Therefore, the find command will never find any .po file.

I think this is a problem and plan to fix it:
when USE_PGXS is on, we can use:
find . -name '*.po' -print

Or does anyone have other advice/comments?

Regards
Shenhao Wang





Re: Opclass parameters of indexes lost after REINDEX CONCURRENTLY[

2021-10-31 Thread Michael Paquier
On Sat, Oct 30, 2021 at 09:26:35PM +0900, Michael Paquier wrote:
> Yeah, you are right that it would be better here to use
> get_attoptions() to grab a copy of each attribute's option directly
> from the catalogs.  We also do that for predicates and expressions.

While looking again at this one this morning, I have extended the
tests with more columns and some default values, then applied the
patch by using get_attoptions() to grab each attribute's options as of
add5cf2.  Predicates and expressions are grabbed through the syscache,
so that's just more consistent.
--
Michael




Re: inefficient loop in StandbyReleaseLockList()

2021-10-31 Thread Kyotaro Horiguchi
At Sun, 31 Oct 2021 16:55:01 -0400, Tom Lane  wrote in 
> I wrote:
> > Pushed the patch as given.  I've not yet reviewed other list_delete_first
> > callers, but I'll take a look.  (I seem to remember that I did survey
> > them while writing 1cff1b95a, but I evidently missed that this code
> > could be dealing with a list long enough to be problematic.)
> 
> I looked at the remaining list_delete_first callers.
> 
> 1. Attached is a proposed patch to get rid of the calls in trgm_regexp.c.
> I'm not certain that those lists could get long enough to be a problem,
> given the existing complexity limits in that file (MAX_EXPANDED_STATES
> etc).  But I'm not certain they can't, either, and it's easy enough to
> fix along the same lines as in StandbyReleaseLockList.

I may be missing something, but at the added list_free() there's a
case where keysQueue still has some elements.  I think the remaining
elements are useless, but AFAICS all the memory allocated there is
freed after createTrgmNFAInternal returns, before the "next cycle"
comes. Do we need to add that list_free() there?

> 2. I think we almost certainly have a problem in SyncPostCheckpoint.

Maybe we want list_delete_first_n() or such to remove the first n
elements in a list at once.
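Something like this, sketched against the array-based List representation
from 1cff1b95a (the name and body are just an illustration, not a patch):

  List *
  list_delete_first_n(List *list, int n)
  {
      /* deleting all (or more) cells degenerates to freeing the list */
      if (n >= list_length(list))
      {
          list_free(list);
          return NIL;
      }

      /* slide the surviving cells down over the removed ones, in one pass */
      memmove(&list->elements[0], &list->elements[n],
              (list->length - n) * sizeof(ListCell));
      list->length -= n;

      return list;
  }

That would turn n O(length) deletions into a single O(length) memmove.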

> 3. Is agg_refill_hash_table a problem?  Probably; we've seen cases
> with lots of batches.

I excluded it since, at a glance, I'm not sure it follows the same
pattern. I would want to leave it alone, since changing the logic there
seems to make things a bit more complex, and the gain from removing
list_delete_first doesn't look so large...

> 4. I'm a bit worried about the uses in access/gist/, but I don't know
> that code well enough to want to mess with it.  It's possible the
> list lengths are bounded by the index tree height, in which case it
> likely doesn't matter.  The logic in gistFindPath looks like a mess
> anyway since it's appending to both ends of the "fifo" list in different
> places (is that really necessary?).

From the other side, the elements are inserted by lcons, then removed
by list_delete_first.  That is the worst use of the current list
implementation as a FIFO. Couldn't we construct and iterate over the
list in the reverse order?

> 5. Not sure about CopyMultiInsertInfoFlush ... how many buffers
> could we have there?

(I'm not sure..)

> 6. llvm_release_context may not have a long enough list to be a
> problem, but on the other hand, it looks easy to fix.

Agreed.

> 7. The list lengths in the parser and dependency.c, ruleutils.c,
> explain.c are bounded by subquery nesting depth or plan tree depth,
> so I doubt it's worth worrying about.

Agreed.

> 8. The uses in namespace.c don't seem like an issue either -- for
> instance, GetOverrideSearchPath can't iterate more than twice,
> and the overrideStack list shouldn't get very deep.

If we didn't need the resulting list I'd be for changing it, but it
actually is needed. So I think we wouldn't gain much by changing the
function.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Use -fvisibility=hidden for shared libraries

2021-10-31 Thread Andres Freund
Hi,

Currently postgres builds extension shared libraries (i.e. pretty much
everything but libpq) with global visibilty. I.e. symbols that are not
static will be visible from outside the shared library.

On the one platform where that behaviour is not available, namely
windows, we emulate it by analyzing the input files to the shared
library and exporting all the symbols therein.

For the meson build [1] proposal that turned out to be a bit more
verbose to implement than I'd liked. Thinking about the problem I
realized that, at least I think so, there's really no good reason to
force-export symbols in our shared libraries:

Because the number of symbols required from shared libraries is
typically small, and the majority of the symbols are generated via
PG_FUNCTION_INFO_V1, it isn't a lot of work to explicitly export the
necessary symbols.
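For a typical extension that could end up looking like this (a sketch;
my_add_one and my_shared_helper are made-up names):

  #include "postgres.h"
  #include "fmgr.h"

  PG_MODULE_MAGIC;

  /*
   * SQL-callable functions need no change: PG_FUNCTION_INFO_V1 already
   * declares both the function and its finfo record PGDLLEXPORT.
   */
  PG_FUNCTION_INFO_V1(my_add_one);

  Datum
  my_add_one(PG_FUNCTION_ARGS)
  {
      PG_RETURN_INT32(PG_GETARG_INT32(0) + 1);
  }

  /*
   * Only symbols needed by *other* shared libraries (e.g. by a transform)
   * have to be annotated explicitly.
   */
  PGDLLEXPORT int
  my_shared_helper(void)
  {
      return 42;
  }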


I think this is a much more feasible proposal than, as has been
suggested in the past, using explicit visibility for the entire
backend. The amount of functions that'd need to be annotated in the
backend is huge, and I don't think we have a reasonable starting point
for limiting what we'd export.


There are a few extension libs that need manual export
annotations. These are libraries that are not just referenced by the
backend, but also from other extensions, e.g. for transforms.


The patch e.g. reduces the number of exported symbols for

- plpython: 61 -> 29
- plperl: 32 -> 17
- llvmjit: 84 -> 5
- postgres_fdw: 48 -> 15
- dict_snowball: 182 -> 8

Besides the smaller number of symbols (which can impact library load
times a bit, although the numbers here are likely small enough to not
matter), using -fvisibility=hidden also can improve code generation,
because it allows the compiler & linker to generate a plain function
call, rather than having to go through the elf symbol-interposition
table.


The attached patch is a prototype of this idea.

Greetings,

Andres Freund

[1] 
https://www.postgresql.org/message-id/20211012083721.hvixq4pnh2pixr3j%40alap3.anarazel.de
From a5a1d25c3e9773edda5fbb73c04c4c13fece72e2 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Sun, 31 Oct 2021 18:02:16 -0700
Subject: [PATCH v2 1/3] Use hidden visibility for shared libraries where
 possible.

---
 configure  | 153 +
 configure.ac   |  14 ++
 contrib/hstore/hstore.h|  16 +--
 contrib/ltree/ltree.h  |  40 +++---
 src/Makefile.global.in |   2 +
 src/Makefile.shlib |  13 ++
 src/include/c.h|  15 +-
 src/include/fmgr.h |  12 +-
 src/include/jit/jit.h  |   2 +-
 src/include/pg_config.h.in |   3 +
 src/include/replication/output_plugin.h|   2 +
 src/pl/plpython/plpy_elog.h|   8 +-
 src/pl/plpython/plpy_typeio.h  |  18 +--
 src/pl/plpython/plpy_util.h|   8 +-
 src/test/modules/test_shm_mq/test_shm_mq.h |   2 +-
 src/test/modules/worker_spi/worker_spi.c   |   2 +-
 src/tools/msvc/Solution.pm |   1 +
 17 files changed, 258 insertions(+), 53 deletions(-)

diff --git a/configure b/configure
index 4ffefe46552..9057791eb00 100755
--- a/configure
+++ b/configure
@@ -735,6 +735,8 @@ CPP
 CFLAGS_SL
 BITCODE_CXXFLAGS
 BITCODE_CFLAGS
+CXXFLAGS_SL_MOD
+CFLAGS_SL_MOD
 CFLAGS_VECTORIZE
 CFLAGS_UNROLL_LOOPS
 PERMIT_DECLARATION_AFTER_STATEMENT
@@ -6421,6 +6423,155 @@ fi
   if test -n "$NOT_THE_CFLAGS"; then
 CFLAGS="$CFLAGS -Wno-stringop-truncation"
   fi
+
+  # If the compiler knows how to hide symbols, use that. But only for shared libraries,
+  # for postgres itself that'd be too verbose for now.
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports -fvisibility=hidden, for CFLAGS_SL_MOD" >&5
+$as_echo_n "checking whether ${CC} supports -fvisibility=hidden, for CFLAGS_SL_MOD... " >&6; }
+if ${pgac_cv_prog_CC_cflags__fvisibility_hidden+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CFLAGS=$CFLAGS
+pgac_save_CC=$CC
+CC=${CC}
+CFLAGS="${CFLAGS_SL_MOD} -fvisibility=hidden"
+ac_save_c_werror_flag=$ac_c_werror_flag
+ac_c_werror_flag=yes
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv_prog_CC_cflags__fvisibility_hidden=yes
+else
+  pgac_cv_prog_CC_cflags__fvisibility_hidden=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+ac_c_werror_flag=$ac_save_c_werror_flag
+CFLAGS="$pgac_save_CFLAGS"
+CC="$pgac_save_CC"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_prog_CC_cflags__fvisibility_hidden" >&5
+$as_echo "$pgac_cv_prog_CC_cflags__fvisibility_hidden" >&6; }
+if test x"$pgac_cv_prog_CC_cflags__fvisibility_hidden" = x"yes"; then
+  CFLAGS_SL_MOD="${CFLAGS_SL_MOD} -fvisibility=hidden"
+fi
+
+
+  if test 

Re: Skipping logical replication transactions on subscriber side

2021-10-31 Thread Masahiko Sawada
On Sat, Oct 30, 2021 at 12:21 PM Amit Kapila  wrote:
>
> On Fri, Oct 29, 2021 at 4:49 PM Amit Kapila  wrote:
> >
> > On Fri, Oct 29, 2021 at 10:54 AM Masahiko Sawada  
> > wrote:
> > >
> > > I'm not sure it's better to drop apply worker stats after resetting
> > > skip xid (i.g., after skipping the transaction). Since the view is a
> > > cumulative view and has last_error_time, I thought we can have the
> > > apply worker stats until the subscription gets dropped.
> > >
> >
> > Fair enough. So statistics can be removed either by vacuum or drop
> > subscription. Also, if we go by this logic then there is no harm in
> > retaining the stat entries for tablesync errors. Why have different
> > behavior for apply and tablesync workers?
> >
> > I have another question in this regard. Currently, the reset function
> > seems to be resetting only the first stat entry for a subscription.
> > But can't we have multiple stat entries for a subscription considering
> > the view's cumulative nature?
> >
>
> Don't we want these stats to be dealt in the same way as tables and
> functions as all the stats entries (subscription entries) are specific
> to a particular database? If so, I think we should write/read these
> to/from db specific stats file in the same way as we do for tables or
> functions. I think in the current patch, it will unnecessarily read
> and probably write subscription stats even when those are not
> required.

Good point! So probably we should have PgStat_StatDBEntry contain the
hash table for subscription worker statistics, right?
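I.e. roughly like this (a sketch showing only the relevant fields; the
real struct in pgstat.h carries many more counters):

  typedef struct PgStat_StatDBEntry
  {
      Oid     databaseid;

      HTAB   *tables;       /* per-table stats, written to the per-db stats file */
      HTAB   *functions;    /* per-function stats, likewise */
      HTAB   *subworkers;   /* proposed: per-subscription-worker stats */
  } PgStat_StatDBEntry;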

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Skipping logical replication transactions on subscriber side

2021-10-31 Thread Masahiko Sawada
On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila  wrote:
>
> On Fri, Oct 29, 2021 at 10:54 AM Masahiko Sawada  
> wrote:
> >
> > On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila  wrote:
> > >
> > > On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada  
> > > wrote:
> > > >
> > > > On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila  
> > > > wrote:
> > > > >
> > > > > On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada 
> > > > >  wrote:
> > > > > >
> > > > > >
> > > > > > I've attached updated patches.
> > > >
> > > > Thank you for the comments!
> > > >
> > > > >
> > > > > Few comments:
> > > > > ==
> > > > > 1. Is the patch cleaning tablesync error entries except via vacuum? If
> > > > > not, can't we send a message to remove tablesync errors once tablesync
> > > > > is successful (say when we reset skip_xid or when tablesync is
> > > > > finished) or when we drop subscription? I think the same applies to
> > > > > apply worker. I think we may want to track it in some way whether an
> > > > > error has occurred before sending the message but relying completely
> > > > > on a vacuum might be the recipe of bloat. I think in the case of a
> > > > > drop subscription we can simply send the message as that is not a
> > > > > frequent operation. I might be missing something here because in the
> > > > > tests after drop subscription you are expecting the entries from the
> > > > > view to get cleared
> > > >
> > > > Yes, I think we can have tablesync worker send a message to drop stats
> > > > once tablesync is successful. But if we do that also when dropping a
> > > > subscription, I think we need to do that only the transaction is
> > > > committed since we can drop a subscription that doesn't have a
> > > > replication slot and rollback the transaction. Probably we can send
> > > > the message only when the subscription does have a replication slot.
> > > >
> > >
> > > Right. And probably for apply worker after updating skip xid.
> >
> > I'm not sure it's better to drop apply worker stats after resetting
> > skip xid (i.g., after skipping the transaction). Since the view is a
> > cumulative view and has last_error_time, I thought we can have the
> > apply worker stats until the subscription gets dropped.
> >
>
> Fair enough. So statistics can be removed either by vacuum or drop
> subscription. Also, if we go by this logic then there is no harm in
> retaining the stat entries for tablesync errors. Why have different
> behavior for apply and tablesync workers?

My understanding is that a subscription worker statistics entry
corresponds to a worker (but not a physical worker, since the physical
process changes after restarting). So once the worker finishes its job,
it is no longer necessary to show its errors, since further problems
will not occur after that. A table sync worker’s job finishes when it
completes the table copy (unless table sync is performed again by REFRESH
PUBLICATION), whereas an apply worker’s job finishes when the subscription
is dropped. Also, I’m concerned about a situation where a lot of table
syncs have failed. In that case, if we don’t drop table sync worker
statistics after the job completes, we end up with a lot of entries
in the view unless the subscription is dropped.

>
> I have another question in this regard. Currently, the reset function
> seems to be resetting only the first stat entry for a subscription.
> But can't we have multiple stat entries for a subscription considering
> the view's cumulative nature?

I might be missing your point, but I think that with the current
patch the view does have multiple entries per subscription. That is,
there is one apply worker stats entry and multiple table sync worker
stats entries per subscription. And the pg_stat_reset_subscription()
function can reset any of these stats by specifying the subscription
OID and relation OID.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: parallel vacuum comments

2021-10-31 Thread Masahiko Sawada
On Sun, Oct 31, 2021 at 6:21 AM Andres Freund  wrote:
>
> Hi,
>
> Due to bug #17245: [1] I spent a considerably amount of time looking at vacuum
> related code. And I found a few things that I think could stand improvement:
>
> - There's pretty much no tests. This is way way too complicated feature for
>   that. If there had been tests for the obvious edge case of some indexes
>   being too small to be handled in parallel, but others needing parallelism,
>   the mistake leading to #17245 would have been caught during development.

Yes. We should have tests at least for such cases.

>
>
> - There should be error check verifying that all indexes have actually been
>   vacuumed. It's way too easy to have bugs leading to index vacuuming being
>   skipped.

Agreed.

>
>
> - The state machine is complicated. It's very unobvious that an index needs to
>   be processed serially by the leader if shared_indstats == NULL.

I think we can consolidate the logic that decides who (a worker or the
leader) processes the index into one function.

>
>
> - I'm very confused by the existance of LVShared->bitmap. Why is it worth
>   saving 7 bits per index for something like this (compared to a simple
>   array of bools)? Nor does the naming explain what it's for.
>
>   The presence of the bitmap requires stuff like SizeOfLVShared(), which
>   accounts for some of the bitmap size, but not all?

Yes, it's better to account for the size of all bitmaps.

>
>   But even though we have this space optimized bitmap thing, we actually need
>   more memory allocated for each index, making this whole thing pointless.

Right. But is it better to change it to use plain booleans?

> - Imo it's pretty confusing to have functions like
>   lazy_parallel_vacuum_indexes() (in 13, renamed in 14) that "Perform index
>   vacuum or index cleanup with parallel workers.", based on
>   lps->lvshared->for_cleanup.

Okay. We need to set lps->lvshared->for_cleanup to tell workers to do
either index vacuum or index cleanup. So it might be better to pass the
for_cleanup flag down to the functions in addition to setting
lps->lvshared->for_cleanup.
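E.g. roughly like this (just a sketch; parallel_process_indexes stands
in for the real callees):

  static void
  lazy_parallel_vacuum_or_cleanup_indexes(LVParallelState *lps, bool for_cleanup)
  {
      /* record the mode where parallel workers will read it ... */
      lps->lvshared->for_cleanup = for_cleanup;

      /* ... but also hand it down explicitly on the leader side */
      parallel_process_indexes(lps, for_cleanup);
  }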

>
>
> - I don't like some of the new names introduced in 14. E.g.
>   "do_parallel_processing" is way too generic.

I listed the function names that probably need to be renamed from
that perspective:

* do_parallel_processing
* do_serial_processing_for_unsafe_indexes
* parallel_process_one_index

Is there any other function that should be renamed?


> - On a higher level, a lot of this actually doesn't seem to belong into
>   vacuumlazy.c, but should be somewhere more generic. Pretty much none of this
>   code is heap specific.  And vacuumlazy.c is large enough without the 
> parallel
>   code.

I haven't come up with an idea for making them more generic. Could you
elaborate on that?

I've started to write a patch for these comments.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: enhance pg_log_backend_memory_contexts() to log memory contexts of auxiliary processes

2021-10-31 Thread Kyotaro Horiguchi
At Fri, 29 Oct 2021 22:25:04 +0530, Bharath Rupireddy 
 wrote in 
> On Mon, Oct 11, 2021 at 9:55 AM Bharath Rupireddy
>  wrote:
> >
> > On Mon, Oct 11, 2021 at 8:21 AM torikoshia  
> > wrote:
> > > If we can use debuggers, it's possible to know the memory contexts e.g.
> > > using MemoryContextStats().
> > > So IMHO if it's necessary to know memory contexts without attaching gdb
> > > for other than client backends(probably this means using under
> > > production environment), this enhancement would be pay.
> >
> > Thanks for providing your thoughts. Knowing memory usage of auxiliary
> > processes is as important as backends (user session processes) without
> > attaching debugger in production environments.
> >
> > There are some open points as mentioned in my first mail in this
> > thread, I will start working  on this patch once we agree on them.
> 
> I'm attaching the v1 patch that enables
> pg_log_backend_memory_contexts() to log memory contexts of auxiliary
> processes. Please review it.
> 
> Here's the CF entry - https://commitfest.postgresql.org/35/3385/

After the patch is applied, the function looks like this:

  proc = BackendPidGetProc(pid);
  if (proc == NULL)


  if (proc == NULL)

  if (!is_aux_proc)

  SendProcSignal(.., the backend id);

is_aux_proc looks like it makes the code complex.  I think we can
remove it.
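I.e. the lookup itself can tell us which case we're in -- a sketch:

  PGPROC     *proc = BackendPidGetProc(pid);
  BackendId   backendId = InvalidBackendId;

  if (proc != NULL)
      backendId = proc->backendId;
  else
      proc = AuxiliaryPidGetProc(pid);    /* aux processes: keep InvalidBackendId */

  if (proc == NULL)
      ereport(WARNING,
              (errmsg("PID %d is not a PostgreSQL server process", pid)));
  else if (SendProcSignal(pid, PROCSIG_LOG_MEMORY_CONTEXT, backendId) < 0)
      ereport(WARNING,
              (errmsg("could not send signal to process %d", pid)));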


+   /* Only regular backends will have valid backend id, auxiliary processes don't. */
+   if (!is_aux_proc)
+   backendId = proc->backendId;

I think the reason we need to do this is not that aux processes have
an invalid backend id (= InvalidBackendId), but that "some" auxiliary
processes may have a broken proc->backendId with regard to
SendProcSignal (we know that's the startup process, for now).


+SELECT pg_log_backend_memory_contexts(memcxt_get_proc_pid('autovacuum launcher'));
+SELECT pg_log_backend_memory_contexts(memcxt_get_proc_pid('logical replication launcher'));
...

Maybe we can reduce (quite a bit of) the run time of the test by
looping over the processes, but since the test only checks that the
function doesn't fail to send a signal, I'm not sure we need to
perform the test for all of the processes here.  On the other hand,
the test is missing the most significant target: the startup
process.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)

2021-10-31 Thread Mark Dilger



> On Oct 29, 2021, at 4:46 PM, Jeff Davis  wrote:
> 
> But I don't think the concept of role ownership has zero potential
> confusion, either. For instance, I could certainly imagine a user A
> creating a role B (and therefore owning it), and then doing "GRANT A TO
> B". Is there a reason to do that, or is the user confused about what
> membership versus ownership mean?

In general, I think that would be the result of the user being confused.  But 
it is hard to say that definitively, because perhaps users A and C want to 
create a single user B with the union of both their roles, and have agreed to 
perform:

user_a%  CREATE ROLE B;
user_a%  GRANT A TO B;
user_c%  GRANT C TO B;

The easiest way of thinking about role ownership is that a role's owner is
superuser insofar as that role is concerned.  It can drop the role, modify
it, take its objects away from it, assign other objects to it, etc.  Anything
a superuser could do to impoverish the role, its owner can do to impoverish
it.  The difference is that an actual superuser can enrich the role with
anything the superuser likes, whereas its owner can only enrich it with
objects and privileges that the owner itself has rights to assign.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Missing include in be-secure-openssl.c?

2021-10-31 Thread Tom Lane
hamerkop has been failing recently [1] with

  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(67):
  errors C2143, C2091, C2059 (syntax errors; the original compiler messages
  are Japanese, rendered as octal escapes in the log)
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(571):
  error C2065: 'x509name': undeclared identifier
  error C2296: '*': invalid; left operand has type 'LPCSTR'
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(576):
  error C2065: 'x509name': undeclared identifier
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(582):
  error C2065: 'x509name': undeclared identifier
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(623):
  error C2065: 'x509name': undeclared identifier
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(1377):
  errors C2143, C2091, C2059 (syntax errors)

Re: row filtering for logical replication

2021-10-31 Thread Peter Smith
The v34* patch set is temporarily broken.

It was impacted quite a lot by the recently committed "schema
publication" patch [1].

We are actively fixing the full v34* patch set and will re-post it
here as soon as the re-base hurdles can be overcome.

Meanwhile, the small tab-complete patch (which is independent of the
others) is the only patch currently working, so I am attaching it so
at least the cfbot can have something to run.

--
[1] 
https://github.com/postgres/postgres/commit/5a2832465fd8984d089e8c44c094e6900d987fcd

Kind Regards,
Peter Smith.
Fujitsu Australia


v35-0001-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patch
Description: Binary data


Re: Add additional information to src/test/ssl/README

2021-10-31 Thread Tom Lane
Daniel Gustafsson  writes:
> On 31 Oct 2021, at 19:28, Tom Lane  wrote:
>> Here's a quick stab at rearranging src/test/perl/README so that the
>> initial section is all how-to-run-the-tests info, and advice about
>> writing new tests starts after that.  Your point about PG_TEST_NOCLEAN
>> could be added into the initial section.

> That's pretty much just what I was thinking, definitely +1 on this patch.

Done that way; feel free to add more material to perl/README.

regards, tom lane




Re: should we enable log_checkpoints out of the box?

2021-10-31 Thread Michael Banck
Hi,

On Sun, Oct 31, 2021 at 01:16:33PM -0700, Andres Freund wrote:
> On 2021-10-31 15:43:57 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2021-10-31 10:59:19 -0400, Tom Lane wrote:
> > >> No DBA would be likely to consider it as anything but log spam.
> > 
> > > I don't agree at all. No postgres instance should be run without
> > > log_checkpoints enabled. Performance is poor if checkpoints are
> > > triggered by anything but time, and that can only be diagnosed if
> > > log_checkpoints is on.
> > 
> > This is complete nonsense.
> 
> Shrug. It's based on many years of doing or being around people doing
> postgres support escalation shifts. And it's not like log_checkpoints
> incurs meaningful overhead or causes that much log volume.

It could be a bit of reverse-survivorship-bias, i.e., you're only seeing
the pathological cases, while 99% of the Postgres installations out
there just hum along fine without anybody ever having to look at the
checkpoint entries in the log.

But yeah, for serious production instances, it makes sense to turn that
option on, but I'm not convinced it should be the default.

To put another option on the table: maybe a compromise could be to log
xlog-triggered checkpoints unconditionally, and the timed
(checkpoint_timeout) ones only if log_checkpoints is set (maybe with some
exponential backoff to avoid log spam)?
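In code terms the check could be as simple as this sketch
(ShouldLogCheckpoint is a made-up name, and the backoff part is omitted):

  static bool
  ShouldLogCheckpoint(int flags)
  {
      if (log_checkpoints)
          return true;

      /* always flag checkpoints forced by WAL volume */
      return (flags & CHECKPOINT_CAUSE_XLOG) != 0;
  }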


Michael

-- 
Michael Banck
Team Lead PostgreSQL
Project Manager
Tel.: +49 2166 9901-171
Mail: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Geoff Richardson, Peter Lilley

Our handling of personal data is subject to:
https://www.credativ.de/en/contact/privacy/




Re: Added schema level support for publication.

2021-10-31 Thread Tomas Vondra

Hi,

On 10/28/21 04:41, Amit Kapila wrote:

On Mon, Oct 25, 2021 at 3:09 PM Amit Kapila  wrote:


On Mon, Oct 25, 2021 at 1:11 PM vignesh C  wrote:


I have fixed this in the v47 version attached.



Thanks, the first patch in the series "Allow publishing the tables of
schema." looks good to me. Unless there are more
comments/bugs/objections, I am planning to commit it in a day or so.



Yesterday, I have pushed the first patch. Feel free to submit the
remaining patches.



I haven't been following this thread recently, but while rebasing the 
sequence decoding patch I noticed this adds


PUBLICATIONOBJ_TABLE,            /* Table type */
PUBLICATIONOBJ_REL_IN_SCHEMA,    /* Relations in schema type */

Shouldn't it be PUBLICATIONOBJ_TABLE_IN_SCHEMA, or why does it use rel 
instead of table?


I'm asking because the sequence decoding patch mimics ALTER PUBLICATION 
options for sequences, including ALL SEQUENCES IN SCHEMA etc. and this 
seems ambiguous. The same issue applies to PUBLICATIONOBJ_CURRSCHEMA, 
which does not specify the object type.


I wonder if it'd be better to just separate the schema and object type 
specification, instead of mashing it into a single constant. Otherwise 
we'll end up with (M x N) combinations, which seems silly.



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: should we enable log_checkpoints out of the box?

2021-10-31 Thread Tomas Vondra

On 10/31/21 21:16, Andres Freund wrote:

Hi,

On 2021-10-31 15:43:57 -0400, Tom Lane wrote:

Andres Freund  writes:

On 2021-10-31 10:59:19 -0400, Tom Lane wrote:

No DBA would be likely to consider it as anything but log spam.



I don't agree at all. No postgres instance should be run without
log_checkpoints enabled. Performance is poor if checkpoints are
triggered by anything but time, and that can only be diagnosed if
log_checkpoints is on.


This is complete nonsense.


Shrug. It's based on many years of doing or being around people doing
postgres support escalation shifts. And it's not like log_checkpoints
incurs meaningful overhead or causes that much log volume.



Yeah. In tuned instances the checkpoints happen fairly infrequently most
of the time (occasional batch loads being an exception, etc.), so the
extra log traffic should be fairly negligible. And it's not like we can
make the checkpointer infinitely smart - sometimes the cause is a change
in the workload etc.


OTOH most of this data (# of timed/xlog checkpoints, buffers written by 
checkpointer etc.) is available in the pg_stat_bgwriter view, and people 
generally have monitoring these days.


I don't have a strong opinion on changing the default - we generally 
recommend changing it, and we'll keep doing it, but it's usually part of 
various other recommendations so this one tweak does not eliminate that.



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: inefficient loop in StandbyReleaseLockList()

2021-10-31 Thread Tom Lane
I wrote:
> Pushed the patch as given.  I've not yet reviewed other list_delete_first
> callers, but I'll take a look.  (I seem to remember that I did survey
> them while writing 1cff1b95a, but I evidently missed that this code
> could be dealing with a list long enough to be problematic.)

I looked at the remaining list_delete_first callers.

1. Attached is a proposed patch to get rid of the calls in trgm_regexp.c.
I'm not certain that those lists could get long enough to be a problem,
given the existing complexity limits in that file (MAX_EXPANDED_STATES
etc).  But I'm not certain they can't, either, and it's easy enough to
fix along the same lines as in StandbyReleaseLockList.

2. I think we almost certainly have a problem in SyncPostCheckpoint.

3. Is agg_refill_hash_table a problem?  Probably; we've seen cases
with lots of batches.

4. I'm a bit worried about the uses in access/gist/, but I don't know
that code well enough to want to mess with it.  It's possible the
list lengths are bounded by the index tree height, in which case it
likely doesn't matter.  The logic in gistFindPath looks like a mess
anyway since it's appending to both ends of the "fifo" list in different
places (is that really necessary?).

5. Not sure about CopyMultiInsertInfoFlush ... how many buffers
could we have there?

6. llvm_release_context may not have a long enough list to be a
problem, but on the other hand, it looks easy to fix.

7. The list lengths in the parser and dependency.c, ruleutils.c,
explain.c are bounded by subquery nesting depth or plan tree depth,
so I doubt it's worth worrying about.

8. The uses in namespace.c don't seem like an issue either -- for
instance, GetOverrideSearchPath can't iterate more than twice,
and the overrideStack list shouldn't get very deep.

regards, tom lane

diff --git a/contrib/pg_trgm/trgm_regexp.c b/contrib/pg_trgm/trgm_regexp.c
index 64816dd370..1887a39161 100644
--- a/contrib/pg_trgm/trgm_regexp.c
+++ b/contrib/pg_trgm/trgm_regexp.c
@@ -907,6 +907,7 @@ transformGraph(TrgmNFA *trgmNFA)
 	HASHCTL		hashCtl;
 	TrgmStateKey initkey;
 	TrgmState  *initstate;
+	ListCell   *lc;
 
 	/* Initialize this stage's workspace in trgmNFA struct */
 	trgmNFA->queue = NIL;
@@ -937,12 +938,13 @@ transformGraph(TrgmNFA *trgmNFA)
 	/*
 	 * Recursively build the expanded graph by processing queue of states
 	 * (breadth-first search).  getState already put initstate in the queue.
+	 * Note that getState will append new states to the queue within the loop,
+	 * too; this works as long as we don't do repeat fetches using the "lc"
+	 * pointer.
 	 */
-	while (trgmNFA->queue != NIL)
+	foreach(lc, trgmNFA->queue)
 	{
-		TrgmState  *state = (TrgmState *) linitial(trgmNFA->queue);
-
-		trgmNFA->queue = list_delete_first(trgmNFA->queue);
+		TrgmState  *state = (TrgmState *) lfirst(lc);
 
 		/*
 		 * If we overflowed then just mark state as final.  Otherwise do
@@ -966,22 +968,29 @@ transformGraph(TrgmNFA *trgmNFA)
 static void
 processState(TrgmNFA *trgmNFA, TrgmState *state)
 {
+	ListCell   *lc;
+
 	/* keysQueue should be NIL already, but make sure */
 	trgmNFA->keysQueue = NIL;
 
 	/*
 	 * Add state's own key, and then process all keys added to keysQueue until
-	 * queue is empty.  But we can quit if the state gets marked final.
+	 * queue is finished.  But we can quit if the state gets marked final.
 	 */
 	addKey(trgmNFA, state, &state->stateKey);
-	while (trgmNFA->keysQueue != NIL && !(state->flags & TSTATE_FIN))
+	foreach(lc, trgmNFA->keysQueue)
 	{
-		TrgmStateKey *key = (TrgmStateKey *) linitial(trgmNFA->keysQueue);
+		TrgmStateKey *key = (TrgmStateKey *) lfirst(lc);
 
-		trgmNFA->keysQueue = list_delete_first(trgmNFA->keysQueue);
+		if (state->flags & TSTATE_FIN)
+			break;
 		addKey(trgmNFA, state, key);
 	}
 
+	/* Release keysQueue to clean up for next cycle */
+	list_free(trgmNFA->keysQueue);
+	trgmNFA->keysQueue = NIL;
+
 	/*
 	 * Add outgoing arcs only if state isn't final (we have no interest in
 	 * outgoing arcs if we already match)


small change to comment for ATExecDetachPartition

2021-10-31 Thread Zhihong Yu
Hi,
I was going over:

commit 71f4c8c6f74ba021e55d35b1128d22fb8c6e1629
Author: Alvaro Herrera 
Date:   Thu Mar 25 18:00:28 2021 -0300

ALTER TABLE ... DETACH PARTITION ... CONCURRENTLY

It seems there were some missing words in one of the comments.

See patch attached.

Cheers


detach-partition-doc.patch
Description: Binary data


Re: logical decoding and replication of sequences

2021-10-31 Thread Tomas Vondra

On 9/25/21 22:05, Hannu Krosing wrote:

Just a note for some design decisions


1) By default, sequences are treated non-transactionally, i.e. sent to the 
output plugin right away.


If our aim is just to make sure that all user-visible data in
*transactional* tables is consistent with sequence state, then one
much-simplified approach to this could be to track the results of
nextval() calls in a transaction and, at COMMIT, put the latest sequence
value in WAL (or just track the sequences affected and put the latest
sequence state in WAL at commit, which needs an extra read of the
sequence but protects against race conditions with parallel transactions
that get rolled back later).



Not sure. TBH I feel rather uneasy about adding more stuff in COMMIT.


This avoids sending redundant changes for multiple nextval() calls
(like loading a million-row table with a sequence-generated id column).



Yeah, it'd be nice to have to optimize this a bit, somehow. But I'd bet 
it's a negligible amount of data / changes, compared to the table.



And one can argue that we can safely ignore anything in ROLLBACKed
transactions. This is assuming that even if we did advance the sequence
past the last value sent by the latest COMMITTED transaction, it does
not matter for database consistency.



I don't think we can ignore aborted (ROLLBACK) transactions, in the 
sense that you can't just discard the increments. Imagine you have this 
sequence of transactions:


BEGIN;
SELECT nextval('s');  -- allocates new chunk of values
ROLLBACK;

BEGIN;
SELECT nextval('s'); -- returns one of the cached values
COMMIT;

If you ignore the aborted transaction, then the sequence increment won't 
be replicated -- but that's wrong, because user now has a visible 
sequence value from that chunk.


So I guess we'd have to maintain a cache of sequences incremented in the 
current session, do nothing in aborted transactions (i.e. keep the 
contents but don't log anything) and log/reset at commit.
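Roughly like this (names are made up; the flush and reset would hook into
transaction commit):

  typedef struct SeqCacheEntry
  {
      Oid     seqoid;         /* hash key */
      int64   last_value;     /* latest value handed out this session */
  } SeqCacheEntry;

  static HTAB *seq_increment_cache = NULL;

  static void
  remember_sequence_increment(Oid seqoid, int64 value)
  {
      SeqCacheEntry *entry;
      bool        found;

      if (seq_increment_cache == NULL)
      {
          HASHCTL     ctl;

          memset(&ctl, 0, sizeof(ctl));
          ctl.keysize = sizeof(Oid);
          ctl.entrysize = sizeof(SeqCacheEntry);
          seq_increment_cache = hash_create("sequence increments", 16,
                                            &ctl, HASH_ELEM | HASH_BLOBS);
      }

      /* overwrite: only the latest value per sequence matters at commit */
      entry = (SeqCacheEntry *) hash_search(seq_increment_cache, &seqoid,
                                            HASH_ENTER, &found);
      entry->last_value = value;
  }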


I wonder if multiple sessions make this even more problematic (e.g. due
to a session just disconnecting mid-transaction, without writing the
abort record at all). But AFAICS that's not an issue, because the other
session has a separate cache for the sequence.



It can matter if customers just call nextval() in rolled-back
transactions and somehow expect those values to be replicated, based on
reasoning along the lines of "sequences are not transactional - so
rollbacks should not matter".



I don't think we guarantee anything for data in transactions that did 
not commit, so this seems like a non-issue. I.e. we don't need to go out 
of our way to guarantee something we never promised.



Or we may get away with most in-detail sequence tracking on the source
if we just keep track of the xmin of the sequence and send the
sequence info over at commit if it == current_transaction_id ?



Not sure I understand this proposal. Can you explain?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: should we enable log_checkpoints out of the box?

2021-10-31 Thread Jan Wieck

On 10/31/21 16:16, Andres Freund wrote:

Hi,

On 2021-10-31 15:43:57 -0400, Tom Lane wrote:

Andres Freund  writes:
> On 2021-10-31 10:59:19 -0400, Tom Lane wrote:
>> No DBA would be likely to consider it as anything but log spam.

> I don't agree at all. No postgres instance should be run without
> log_checkpoints enabled. Performance is poor if checkpoints are
> triggered by anything but time, and that can only be diagnosed if
> log_checkpoints is on.

This is complete nonsense.


Shrug. It's based on many years of doing or being around people doing
postgres support escalation shifts. And it's not like log_checkpoints
incurs meaningful overhead or causes that much log volume.


I agree with Andres 100%. Whenever called to diagnose any type of
problem, this is on the usual checklist, and very few customers have it
turned on. The usefulness of this information very much outweighs the
tiny amount of extra log created.







If we think that's a generic problem, we should be fixing the problem
(ie, making the checkpointer smarter);


We've made it less bad (checkpoint_segments -> max_wal_size, sorting IO
for checkpoints, forcing the OS to flush writes earlier). But it's still
a significant issue. It's not that easy to make it better.


And we kept the default for max_wal_size at 1GB. While it is a "soft" 
limit, it is the main reason why instances are running full bore with a 
huge percentage of full page writes because it is way too small for 
their throughput and nothing in the logs warns them about it. I can run 
a certain TPC-C workload on an 8-core machine quite comfortably when 
max_wal_size is configured at 100G. The exact same TPC-C configuration 
will spiral the machine down if left with default max_wal_size and there 
is zero hint in the logs as to why.





--
Jan Wieck




Re: inefficient loop in StandbyReleaseLockList()

2021-10-31 Thread Andres Freund
Hi,

On 2021-10-31 15:38:15 -0400, Tom Lane wrote:
> Yeah, there's no expectation that this data structure needs to be kept
> consistent after an error; and I'm not real sure that the existing
> code could claim to satisfy such a requirement if we did need it.

To be clear, I was making that comment in response to the search for
other places doing the while(!empty) delete_first() style loops. Some of
them are reached via resowners etc, where reaching the same code
repeatedly is perhaps more realistic.

Greetings,

Andres Freund




Re: should we enable log_checkpoints out of the box?

2021-10-31 Thread Andres Freund
Hi,

On 2021-10-31 15:43:57 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2021-10-31 10:59:19 -0400, Tom Lane wrote:
> >> No DBA would be likely to consider it as anything but log spam.
> 
> > I don't agree at all. No postgres instance should be run without
> > log_checkpoints enabled. Performance is poor if checkpoints are
> > triggered by anything but time, and that can only be diagnosed if
> > log_checkpoints is on.
> 
> This is complete nonsense.

Shrug. It's based on many years of doing or being around people doing
postgres support escalation shifts. And it's not like log_checkpoints
incurs meaningful overhead or causes that much log volume.


> If we think that's a generic problem, we should be fixing the problem
> (ie, making the checkpointer smarter);

We've made it less bad (checkpoint_segments -> max_wal_size, sorting IO
for checkpoints, forcing the OS to flush writes earlier). But it's still
a significant issue. It's not that easy to make it better.

The largest issues are the spikes in data write and WAL volumes. Of
course, tuning max_wal_size up helps, but that has its own set of issues
(we e.g. still PANIC on WAL ENOSPC).

One motivation for working on AIO/DIO is precisely to make checkpoints
less painful, FWIW. We are going to have to do something to reduce the
impact of FPWs at some point, but it's hard.

Greetings,

Andres Freund




Re: Add additional information to src/test/ssl/README

2021-10-31 Thread Daniel Gustafsson
> On 31 Oct 2021, at 19:28, Tom Lane  wrote:
> 
> I wrote:
>> Daniel Gustafsson  writes:
>>> Wouldn't it make more sense to start collecting troubleshooting advice in
>>> src/test/perl/README and instead refer to that in the boilerplate?  I notice
>>> that we don't document for example PG_TEST_NOCLEAN anywhere (which 
>>> admittedly
>>> is my fault), a trubleshooting section in that file would be a good fit.
> 
>> Yeah, we could try to move all the repetitive stuff to there.  I was a bit
>> allergic to the idea of having README files refer to webpages, because you
>> might be offline --- but referencing a different README doesn't have that
>> issue.
> 
> Here's a quick stab at rearranging src/test/perl/README so that the
> initial section is all how-to-run-the-tests info, and advice about
> writing new tests starts after that.  Your point about PG_TEST_NOCLEAN
> could be added into the initial section.

That's pretty much just what I was thinking, definitely +1 on this patch.

> I'd envision adding something like
> 
> "See src/test/perl/README for more info about running these tests."
> 
> to the boilerplate "Running the tests" section in src/test/ssl/README
> and its cohorts, but I didn't bother with that yet.

Sounds good.

--
Daniel Gustafsson   https://vmware.com/





Re: Time to drop plpython2?

2021-10-31 Thread Thomas Munro
On Mon, Nov 1, 2021 at 8:05 AM Andres Freund  wrote:
> > Yeah, it's a bit hard to justify continuing support in HEAD.

+1, it's dropping out of distros, it'd be unsupportable without
unreasonable effort.




Re: should we enable log_checkpoints out of the box?

2021-10-31 Thread Tom Lane
Andres Freund  writes:
> On 2021-10-31 10:59:19 -0400, Tom Lane wrote:
>> No DBA would be likely to consider it as anything but log spam.

> I don't agree at all. No postgres instance should be run without
> log_checkpoints enabled. Performance is poor if checkpoints are
> triggered by anything but time, and that can only be diagnosed if
> log_checkpoints is on.

This is complete nonsense.  If we think that's a generic problem, we
should be fixing the problem (ie, making the checkpointer smarter);
not expecting that DBAs will monitor their logs for an undocumented
issue.  The number of installations where that would actually happen
is epsilon compared to the number where it won't.

regards, tom lane




Re: inefficient loop in StandbyReleaseLockList()

2021-10-31 Thread Tom Lane
"Bossart, Nathan"  writes:
> On 10/28/21, 11:53 PM, "Michael Paquier"  wrote:
>> Actually, as the list of recovery locks is saved in TopMemoryContext,
>> wouldn't it be better to keep a per-cell deletion of the list, which
>> would mean that we'd better do the operation in the reverse order to
>> make things faster with the new list implementation?  But that's what
>> Andres points at with CFIs in the middle of one list of the hash table
>> processed?

> Hm.  IIUC anything bad enough to cause the startup process to break
> out of the StandbyReleaseLockList() loop will also cause the entire
> process to be restarted.  I'm not seeing any risk of reusing a half-
> released lock list.  I might be misunderstanding the concern, though.

Yeah, there's no expectation that this data structure needs to be kept
consistent after an error; and I'm not real sure that the existing
code could claim to satisfy such a requirement if we did need it.
(There's at least a short window where the caller's hash table entry
will point at an already-freed List.)

Pushed the patch as given.  I've not yet reviewed other list_delete_first
callers, but I'll take a look.  (I seem to remember that I did survey
them while writing 1cff1b95a, but I evidently missed that this code
could be dealing with a list long enough to be problematic.)

regards, tom lane




Re: Time to drop plpython2?

2021-10-31 Thread Andres Freund
Hi,

On 2021-10-31 14:49:22 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > To me it seems time to drop plpython2 support. Supporting plpython2
> > until ~7 years after python2 is EOL is already quite long... It'd be one
> > thing if it were completely painless, but imo it's not.
> 
> 7 years?  Oh, you're envisioning dropping plpython2 in HEAD but keeping
> it alive in the back branches.

Yea. It's annoying to support python2, but I don't think it's severe
enough to consider dropping support in released branches. We might get
to that point for the newer released versions, but for now...


> Yeah, it's a bit hard to justify continuing support in HEAD.

Cool.

Greetings,

Andres Freund




Re: should we enable log_checkpoints out of the box?

2021-10-31 Thread Andres Freund
Hi,

On 2021-10-31 10:59:19 -0400, Tom Lane wrote:
> The general policy at the moment is that a normally-functioning server
> should emit *no* log traffic by default (other than a few messages
> at startup and shutdown).  log_checkpoints is a particularly poor
> candidate for an exception to that policy, because it would produce so
> much traffic.

It shouldn't cause that much traffic - if the database is idle /
read-only, we don't trigger checkpoints. And these days that appears to
be reliable.


> No DBA would be likely to consider it as anything but log spam.

I don't agree at all. No postgres instance should be run without
log_checkpoints enabled. Performance is poor if checkpoints are
triggered by anything but time, and that can only be diagnosed if
log_checkpoints is on.


> > It seems the checkpoint stats, which are emitted to the server logs when
> > the GUC log_checkpoints is enabled, are so important that a postgres
> > database provider would never want to disable the GUC.
> 
> This statement seems ridiculous on its face.

It matches my experience. I think it's negligent not to turn it on on
anything but an absolute toy database. And even there it has practically
no cost.

The typical course is that suddenly the database has slowed down to a
crawl because checkpoints happen on an accelerated schedule due to
max_wal_size. Without log_checkpoints one cannot diagnose this, and even
just turning it on going forward isn't quite sufficient, because that
doesn't tell you whether checkpoints were always triggered by "xlog".


> If users need to wait with bated breath for a checkpoint report, we
> have something else we need to fix.

Well, we probably do. But it's not that easy to fundamentally improve
the situation.

Greetings,

Andres Freund




Re: emit recovery stats via a new file or a new hook

2021-10-31 Thread Andres Freund
Hi,

On 2021-10-31 19:06:07 +0530, Bharath Rupireddy wrote:
> It is sometimes super important to be able to answer customer
> questions like: What was the total time taken by the last recovery of
> the server? What was the time taken by each phase of recovery/redo
> processing of the startup process? Why did the recovery take so long?
> We've encountered these questions while dealing with the postgres
> customers. If these stats are available in an easily consumable
> fashion, it will be easier for us to understand, debug and identify
> root cause for "recovery taking a long time" problems, improve if
> possible and answer the customer questions. Also, these recovery stats
> can be read by an external analytical tool to show the recovery
> patterns to the customers directly. Although postgres emits some info
> via server logs thanks to the recent commit [3], it isn't easily
> consumable for the use cases that I mentioned.
> 
> Here are a few thoughts on how we could go about doing this. I
> proposed them earlier in [1],
> 1) capture and write recovery stats into a file
> 2) capture and emit recovery stats via a new hook
> 3) capture and write into a new system catalog table (assuming at the
> end of the recovery the database is in a consistent state, but I'm not
> sure if we ever update any catalog tables in/after the
> startup/recovery phase)
> 
> As Robert rightly suggested at [2], option (3) isn't an easy way to do
> that so we can park that idea aside, options (1) and (2) seem
> reasonable.

I don't think 1) is a good approach, because it just leads us down the
path of having dozens of log files. 2) isn't useful either, because
you'd need to load an extension library first, which users won't
have done before hitting the problem. And 3) isn't really possible.

I'm not sure that the new log messages aren't sufficient. But if they
aren't, it seems better to keep additional data in the stats system, and
make them visible via views, rather than adding yet another place to
keep stats.
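
For illustration only, that would let users read recovery numbers like any
other cumulative statistics, along these lines; the view name and columns
below are invented for the example and exist in no release:

-- hypothetical view, shown only to illustrate the stats-and-views idea
SELECT phase, start_time, end_time FROM pg_stat_recovery;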

Greetings,

Andres Freund




Re: Time to drop plpython2?

2021-10-31 Thread Tom Lane
Andres Freund  writes:
> To me it seems time to drop plpython2 support. Supporting plpython2
> until ~7 years after python2 is EOL is already quite long... It'd be one
> thing if it were completely painless, but imo it's not.

7 years?  Oh, you're envisioning dropping plpython2 in HEAD but keeping
it alive in the back branches.  Yeah, it's a bit hard to justify
continuing support in HEAD.

regards, tom lane




Time to drop plpython2?

2021-10-31 Thread Andres Freund
Hi,

The last release of python2 was 2.7.18, released 2020-04-20, with support
having already ended before that, on 2020-01-01.

To me it seems time to drop plpython2 support. Supporting plpython2
until ~7 years after python2 is EOL is already quite long... It'd be one
thing if it were completely painless, but imo it's not.

I was about to list better plpython2/3 support (including the regex
replacements for the regress tests) as a TODO for the meson proposal,
but it doesn't really seem worth investing in that...

Greetings,

Andres Freund




Re: Add additional information to src/test/ssl/README

2021-10-31 Thread Tom Lane
I wrote:
> Daniel Gustafsson  writes:
>> Wouldn't it make more sense to start collecting troubleshooting advice in
>> src/test/perl/README and instead refer to that in the boilerplate?  I notice
>> that we don't document for example PG_TEST_NOCLEAN anywhere (which admittedly
>> is my fault), a troubleshooting section in that file would be a good fit.

> Yeah, we could try to move all the repetitive stuff to there.  I was a bit
> allergic to the idea of having README files refer to webpages, because you
> might be offline --- but referencing a different README doesn't have that
> issue.

Here's a quick stab at rearranging src/test/perl/README so that the
initial section is all how-to-run-the-tests info, and advice about
writing new tests starts after that.  Your point about PG_TEST_NOCLEAN
could be added into the initial section.

I'd envision adding something like

"See src/test/perl/README for more info about running these tests."

to the boilerplate "Running the tests" section in src/test/ssl/README
and its cohorts, but I didn't bother with that yet.

regards, tom lane

diff --git a/src/test/perl/README b/src/test/perl/README
index 0e9a00ea05..f75b224175 100644
--- a/src/test/perl/README
+++ b/src/test/perl/README
@@ -12,22 +12,29 @@ $(prove_installcheck) targets in Makefile.global. By default every test in the
 t/ subdirectory is run. Individual test(s) can be run instead by passing
 something like PROVE_TESTS="t/001_testname.pl t/002_othertestname.pl" to make.
 
-You should prefer to write tests using pg_regress in src/test/regress, or
-isolation tester specs in src/test/isolation, if possible. If not, check to
-see if your new tests make sense under an existing tree in src/test, like
-src/test/ssl, or should be added to one of the suites for an existing utility.
-
-Note that all tests and test tools should have perltidy run on them before
-patches are submitted, using perltidy --profile=src/tools/pgindent/perltidyrc
-
 By default, to keep the noise low during runs, we do not set any flags via
 PROVE_FLAGS, but this can be done on the 'make' command line if desired, eg:
 
 make check-world PROVE_FLAGS='--verbose'
 
+When a test fails, the terminal output from 'prove' is usually not sufficient
+to diagnose the problem.  Look into the log files that are left under
+tmp_check/log/ to get more info.  Files named 'regress_XXX' are log output
+from the perl test scripts themselves, and should be examined first.
+Other files are postmaster logs, and may be helpful as additional data.
+
+
 Writing tests
 -
 
+You should prefer to write tests using pg_regress in src/test/regress, or
+isolation tester specs in src/test/isolation, if possible. If not, check to
+see if your new tests make sense under an existing tree in src/test, like
+src/test/ssl, or should be added to one of the suites for an existing utility.
+
+Note that all tests and test tools should have perltidy run on them before
+patches are submitted, using perltidy --profile=src/tools/pgindent/perltidyrc
+
 Tests are written using Perl's Test::More with some PostgreSQL-specific
 infrastructure from src/test/perl providing node management, support for
 invoking 'psql' to run queries and get results, etc. You should read the


Logical insert/update/delete WAL records for custom table AMs

2021-10-31 Thread Jeff Davis
Attached is a WIP patch to add new WAL records to represent a logical
insert, update, or delete. These records do not do anything at REDO
time, they are only processed during logical decoding/replication.

These are intended to be used by a custom table AM, like my columnar
compression extension[0], which currently supports physical replication
but can't support logical decoding/replication because decoding is not
extensible. Using these new logical records would be redundant, making
inserts/updates/deletes less efficient, but at least logical decoding
would work (the lack of which is columnar's biggest weakness).
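
To make concrete what "logical decoding would work" means: changes to a
table using a custom AM could then flow through the existing decoding
interfaces, for example (the table name is hypothetical; the slot
functions and the test_decoding plugin are in core):

SELECT pg_create_logical_replication_slot('demo_slot', 'test_decoding');
INSERT INTO columnar_tab VALUES (1);  -- a table using a custom table AM
SELECT data FROM pg_logical_slot_get_changes('demo_slot', NULL, NULL);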

Alternatively, we could support extensible WAL with extensible
decoding. I also like this approach, but it takes more work for an AM
like columnar to get that right -- it needs to keep additional state in
the walsender to track and assemble the compressed columns stored
across many blocks. It also requires a lot of care, because mistakes
can get you into serious trouble.

This proposal, for new logical records without WAL extensibility,
provides a more shallow ramp to get a table AM working (including
logical replication/decoding) without the need to invest in the WAL
design. Later, of course I'd like the option for extensible WAL as well
(to be more efficient), but right now I'd prefer it just worked
(inefficiently).

The patch is still very rough, but I tried it on simple insert cases in my
columnar[0] extension (which only supports insert, not update/delete).
I'm looking for some review on the approach and structure before I
polish and test it. Note that my main test case is columnar, which
doesn't support update/delete. Also note that the patch is against v14
(for now).

Regards,
Jeff Davis

[0] https://github.com/citusdata/citus/tree/master/src/backend/columnar
diff --git a/src/backend/access/rmgrdesc/Makefile b/src/backend/access/rmgrdesc/Makefile
index f88d72fd862..ed6dff179be 100644
--- a/src/backend/access/rmgrdesc/Makefile
+++ b/src/backend/access/rmgrdesc/Makefile
@@ -18,7 +18,7 @@ OBJS = \
 	gistdesc.o \
 	hashdesc.o \
 	heapdesc.o \
-	logicalmsgdesc.o \
+	logicaldesc.o \
 	mxactdesc.o \
 	nbtdesc.o \
 	relmapdesc.o \
diff --git a/src/backend/access/rmgrdesc/logicalmsgdesc.c b/src/backend/access/rmgrdesc/logicaldesc.c
similarity index 59%
rename from src/backend/access/rmgrdesc/logicalmsgdesc.c
rename to src/backend/access/rmgrdesc/logicaldesc.c
index d64ce2e7eff..079ab5a847e 100644
--- a/src/backend/access/rmgrdesc/logicalmsgdesc.c
+++ b/src/backend/access/rmgrdesc/logicaldesc.c
@@ -1,22 +1,22 @@
 /*-
  *
- * logicalmsgdesc.c
- *	  rmgr descriptor routines for replication/logical/message.c
+ * logicaldesc.c
+ *	  rmgr descriptor routines for replication/logical/logical_xlog.c
  *
  * Portions Copyright (c) 2015-2021, PostgreSQL Global Development Group
  *
  *
  * IDENTIFICATION
- *	  src/backend/access/rmgrdesc/logicalmsgdesc.c
+ *	  src/backend/access/rmgrdesc/logicaldesc.c
  *
  *-
  */
 #include "postgres.h"
 
-#include "replication/message.h"
+#include "replication/logical_xlog.h"
 
 void
-logicalmsg_desc(StringInfo buf, XLogReaderState *record)
+logical_desc(StringInfo buf, XLogReaderState *record)
 {
 	char	   *rec = XLogRecGetData(record);
 	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
@@ -40,13 +40,42 @@ logicalmsg_desc(StringInfo buf, XLogReaderState *record)
 			sep = " ";
 		}
 	}
+	else if (info == XLOG_LOGICAL_INSERT)
+	{
+
+	}
+	else if (info == XLOG_LOGICAL_UPDATE)
+	{
+
+	}
+	else if (info == XLOG_LOGICAL_DELETE)
+	{
+
+	}
+	else if (info == XLOG_LOGICAL_TRUNCATE)
+	{
+
+	}
 }
 
 const char *
-logicalmsg_identify(uint8 info)
+logical_identify(uint8 info)
 {
-	if ((info & ~XLR_INFO_MASK) == XLOG_LOGICAL_MESSAGE)
-		return "MESSAGE";
+	switch (info)
+	{
+		case XLOG_LOGICAL_MESSAGE:
+			return "MESSAGE";
+		case XLOG_LOGICAL_INSERT:
+			return "INSERT";
+		case XLOG_LOGICAL_UPDATE:
+			return "UPDATE";
+		case XLOG_LOGICAL_DELETE:
+			return "DELETE";
+		case XLOG_LOGICAL_TRUNCATE:
+			return "TRUNCATE";
+		default:
+			return NULL;
+	}
 
 	return NULL;
 }
diff --git a/src/backend/access/transam/rmgr.c b/src/backend/access/transam/rmgr.c
index 58091f6b520..f31f7187ac4 100644
--- a/src/backend/access/transam/rmgr.c
+++ b/src/backend/access/transam/rmgr.c
@@ -24,7 +24,7 @@
 #include "commands/dbcommands_xlog.h"
 #include "commands/sequence.h"
 #include "commands/tablespace.h"
-#include "replication/message.h"
+#include "replication/logical_xlog.h"
 #include "replication/origin.h"
 #include "storage/standby.h"
 #include "utils/relmapper.h"
diff --git a/src/backend/replication/logical/Makefile b/src/backend/replication/logical/Makefile
index c4e2fdeb719..9fa281a1dca 100644
--- a/src/backend/replication/logical/Makefile
+++ b/src/backend/replication/logical/Makefile
@@ -19,7 +19,7 @@ OBJS = \
 	launcher.o \
 	

Re: Feature request for adoptive indexes

2021-10-31 Thread Tomas Vondra




On 10/31/21 16:48, Pavel Borisov wrote:
>>    4 columns: 106 ms
>>    6 columns: 109 ms
>>
>> So there's like 3% difference between the two cases, and even that
>> might be just noise. This is consistent with the two indexes being
>> about the same size.
>
> I also don't think we can get great speedup in the mentioned case, so it
> is not urgently needed of course. My point is that it is just nice to
> have a multicolumn index constructed on stacked trees constructed on
> separate columns, not on the index tuples as a whole thing.

Well, I'd say "nice to have" features are pointless unless they actually
give tangible benefits (like speedup) to users. I'd bet no one is going
to implement and maintain something unless it has such a benefit, because
they have to weigh it against other beneficial features.

Maybe there are use cases where this would be beneficial, but so far we
haven't seen one. Usually it's the OP who presents such a case, and a
plausible way to improve it - but it seems this thread presents a
solution and now we're looking for an issue it might solve.

> At least there is a benefit of sparing shared memory if we don't need
> to cache index tuples of several similar indexes, instead caching one
> "compound index". So if someone wants to propose this thing I'd
> support it, provided the problems with concurrency that were mentioned
> by Peter are solved.

The problem with this is that it assumes the new index would use
(significantly) less space than three separate indexes. I find that
rather unlikely, but maybe there is a smart way to achieve that (though
certainly I have not thought it through in detail).

I don't want to sound overly pessimistic, and if you have an idea how to
do this, I'd like to hear it. But it seems pretty tricky, particularly
if we assume the suffix columns are more variable (which limits the
"compression" ratio etc.).

> These problems could turn out to be easy though, as we have index tuples
> constructed in a similar way as heap tuples. Maybe it could be easier if
> we had another heap am, which stored separate attributes (if so it could
> be useful as a better JSON storage method than we have today).

IMO this just moved the goalposts somewhere outside the solar system.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Feature request for adoptive indexes

2021-10-31 Thread Pavel Borisov
>
>   4 columns: 106 ms
>   6 columns: 109 ms
>
> So there's like 3% difference between the two cases, and even that might
> be just noise. This is consistent with the two indexes being about the
> same size.
>
I also don't think we can get great speedup in the mentioned case, so it is
not urgently needed of course. My point is that it is just nice to have a
multicolumn index constructed on stacked trees constructed on separate
columns, not on the index tuples as a whole thing. At least there is a
benefit of sparing shared memory if we don't need to cache index tuples of
several similar indexes, instead caching one "compound index". So if
someone wants to propose this thing I'd support it, provided the problems
with concurrency that were mentioned by Peter are solved.

These problems could turn out to be easy though, as we have index tuples
constructed in a similar way as heap tuples. Maybe it could be easier if we
had another heap am, which stored separate attributes (if so it could be
useful as a better JSON storage method than we have today).

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com 


Re: should we enable log_checkpoints out of the box?

2021-10-31 Thread Tom Lane
Bharath Rupireddy  writes:
> How about we enable it out of the box?

No.

The general policy at the moment is that a normally-functioning server
should emit *no* log traffic by default (other than a few messages
at startup and shutdown).  log_checkpoints is a particularly poor
candidate for an exception to that policy, because it would produce so
much traffic.  No DBA would be likely to consider it as anything but
log spam.

> It seems the checkpoint stats, which are emitted to the server logs when
> the GUC log_checkpoints is enabled, are so important that a postgres
> database provider would never want to disable the GUC.

This statement seems ridiculous on its face.  If users need to wait
with bated breath for a checkpoint report, we have something else
we need to fix.

regards, tom lane




should we enable log_checkpoints out of the box?

2021-10-31 Thread Bharath Rupireddy
Hi,

It seems the checkpoint stats, which are emitted to the server logs when
the GUC log_checkpoints is enabled, are so important that a postgres
database provider would never want to disable the GUC. Right now, the
GUC is disabled by default and a postgres database provider needs to
enable it explicitly. How about we enable it out of the box? One
disadvantage is that it may fill up the server logs if checkpoints are
very frequent. However, the checkpoint stats are captured even when the
GUC is disabled [1].

Thoughts?

[1]
/*
 * Prepare to accumulate statistics.
 *
 * Note: because it is possible for log_checkpoints to change while a
 * checkpoint proceeds, we always accumulate stats, even if
 * log_checkpoints is currently off.
 */
MemSet(&CheckpointStats, 0, sizeof(CheckpointStats));
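
For context, enabling it on a running installation requires no restart,
since log_checkpoints can be changed with a configuration reload:

ALTER SYSTEM SET log_checkpoints = on;
SELECT pg_reload_conf();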

Regards,
Bharath Rupireddy.




emit recovery stats via a new file or a new hook

2021-10-31 Thread Bharath Rupireddy
Hi,

It is sometimes super important to be able to answer customer
questions like: What was the total time taken by the last recovery of
the server? What was the time taken by each phase of recovery/redo
processing of the startup process? Why did the recovery take so long?
We've encountered these questions while dealing with the postgres
customers. If these stats are available in an easily consumable
fashion, it will be easier for us to understand, debug and identify
root cause for "recovery taking a long time" problems, improve if
possible and answer the customer questions. Also, these recovery stats
can be read by an external analytical tool to show the recovery
patterns to the customers directly. Although postgres emits some info
via server logs thanks to the recent commit [3], it isn't easily
consumable for the use cases that I mentioned.

Here are a few thoughts on how we could go about doing this. I
proposed them earlier in [1],
1) capture and write recovery stats into a file
2) capture and emit recovery stats via a new hook
3) capture and write into a new system catalog table (assuming at the
end of the recovery the database is in a consistent state, but I'm not
sure if we ever update any catalog tables in/after the
startup/recovery phase)

As Robert rightly suggested at [2], option (3) isn't an easy way to do
that so we can park that idea aside, options (1) and (2) seem
reasonable.

Thoughts?

[1] - 
https://www.postgresql.org/message-id/CALj2ACUwb3x%2BJFHkXp4Lf603Q3qFgK0P6kSsJvZkH4QAvGv4ig%40mail.gmail.com
[2] - 
https://www.postgresql.org/message-id/CA%2BTgmoZ0b7JkNexaoGDXJ%3D8Zq%2B_NFZBek1oyyPU%2BDDsRi1dsCw%40mail.gmail.com
[3] - commit 9ce346eabf350a130bba46be3f8c50ba28506969
Author: Robert Haas 
Date:   Mon Oct 25 11:51:57 2021 -0400

Report progress of startup operations that take a long time.

Regards,
Bharath Rupireddy.




Synchronizing slots from primary to standby

2021-10-31 Thread Peter Eisentraut
I want to reactivate $subject.  I took Petr Jelinek's patch from [0], 
rebased it, added a bit of testing.  It basically works, but as 
mentioned in [0], there are various issues to work out.


The idea is that the standby runs a background worker to periodically 
fetch replication slot information from the primary.  On failover, a 
logical subscriber would then ideally find up-to-date replication slots 
on the new publisher and can just continue normally.


The previous thread didn't have a lot of discussion, but I have gathered 
from off-line conversations that there is a wider agreement on this 
approach.  So the next steps would be to make it more robust and 
configurable and documented.  As I said, I added a small test case to 
show that it works at all, but I think a lot more tests should be added. 
 I have also found that this breaks some seemingly unrelated tests in 
the recovery test suite.  I have disabled these here.  I'm not sure if 
the patch actually breaks anything or if these are just differences in 
timing or implementation dependencies.  This patch adds a LIST_SLOTS 
replication command, but I think this could be replaced with just a 
SELECT FROM pg_replication_slots query now.  (This patch is originally 
older than when you could run SELECT queries over the replication protocol.)
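
As a sketch, the replacement query might look like the following (these
are real pg_replication_slots columns; exactly which ones the sync worker
needs is an assumption here):

SELECT slot_name, plugin, database, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_type = 'logical';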


So, again, this isn't anywhere near ready, but there is already a lot 
here to gather feedback about how it works, how it should work, how to 
configure it, and how it fits into an overall replication and HA 
architecture.



[0]: 
https://www.postgresql.org/message-id/flat/3095349b-44d4-bf11-1b33-7eefb585d578%402ndquadrant.com
From 3b3c3fa1d1e92d6b39ab0c869cb9398bf7791d48 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sun, 31 Oct 2021 10:49:29 +0100
Subject: [PATCH v1] Synchronize logical replication slots from primary to
 standby

---
 src/backend/commands/subscriptioncmds.c   |   4 +-
 src/backend/postmaster/bgworker.c |   3 +
 .../libpqwalreceiver/libpqwalreceiver.c   |  73 
 src/backend/replication/logical/Makefile  |   1 +
 src/backend/replication/logical/launcher.c| 199 +++
 src/backend/replication/logical/slotsync.c| 311 ++
 src/backend/replication/logical/tablesync.c   |  13 +-
 src/backend/replication/repl_gram.y   |  13 +-
 src/backend/replication/repl_scanner.l|   1 +
 src/backend/replication/slotfuncs.c   |   2 +-
 src/backend/replication/walsender.c   | 169 +-
 src/backend/utils/activity/wait_event.c   |   3 +
 src/include/commands/subscriptioncmds.h   |   3 +
 src/include/nodes/nodes.h |   1 +
 src/include/nodes/replnodes.h |   8 +
 src/include/replication/logicalworker.h   |   1 +
 src/include/replication/slot.h|   4 +-
 src/include/replication/walreceiver.h |  16 +
 src/include/replication/worker_internal.h |   8 +-
 src/include/utils/wait_event.h|   1 +
 src/test/recovery/t/007_sync_rep.pl   |   3 +-
 .../t/010_logical_decoding_timelines.pl   |   3 +-
 src/test/recovery/t/030_slot_sync.pl  |  51 +++
 23 files changed, 819 insertions(+), 72 deletions(-)
 create mode 100644 src/backend/replication/logical/slotsync.c
 create mode 100644 src/test/recovery/t/030_slot_sync.pl

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..2bab813440 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -743,7 +743,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 
 			RemoveSubscriptionRel(sub->oid, relid);
 
-			logicalrep_worker_stop(sub->oid, relid);
+			logicalrep_worker_stop(MyDatabaseId, sub->oid, relid);
 
 			/*
 			 * For READY state, we would have already dropped the
@@ -1244,7 +1244,7 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
 	{
 		LogicalRepWorker *w = (LogicalRepWorker *) lfirst(lc);
 
-		logicalrep_worker_stop(w->subid, w->relid);
+		logicalrep_worker_stop(w->dbid, w->subid, w->relid);
 	}
 	list_free(subworkers);
 
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index c05f500639..818b8a35e9 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -128,6 +128,9 @@ static const struct
 	},
 	{
 		"ApplyWorkerMain", ApplyWorkerMain
+	},
+	{
+		"ReplSlotSyncMain", ReplSlotSyncMain
 	}
 };
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 5c6e56a5b2..5d2871eb08 100644
---

postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

2021-10-31 Thread Etsuro Fujita
Hi,

As I said before [1], I’m working on $SUBJECT.  Attached is a WIP
patch for that.  The patch is pretty simple: if a server option added
by the patch “parallel_commit” is enabled, 1) asynchronously send
COMMIT TRANSACTION (RELEASE SAVEPOINT) to all remote servers involved
in a local (sub)transaction, then 2) wait for the results from the
remote servers in the order that the command was sent to the remote
servers, when called from pgfdw_xact_callback (pgfdw_subxact_callback)
during pre-commit.  The patch also parallelizes clearing prepared
statements the same way during pre-commit.  (The option is false by
default.)
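
Assuming the usual postgres_fdw server-option syntax, enabling the
proposed option on an existing foreign server (the server name here is
made up) would look like:

ALTER SERVER fs1 OPTIONS (ADD parallel_commit 'true');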

I evaluated the effectiveness of the patch using a simple
multi-statement transaction:

BEGIN;
SAVEPOINT s;
INSERT INTO ft1 VALUES (10, 10);
INSERT INTO ft2 VALUES (20, 20);
RELEASE SAVEPOINT s;
COMMIT;

where ft1 and ft2 are foreign tables created on different foreign
servers hosted on different machines.  I ran the transaction five
times using the patch with the option enabled/disabled, and measured
the latencies for the RELEASE and COMMIT commands in each run.  The
average latencies for these commands over the five runs are:

* RELEASE
  parallel_commit=0: 0.385 ms
  parallel_commit=1: 0.221 ms

* COMMIT
  parallel_commit=0: 1.660 ms
  parallel_commit=1: 0.861 ms

With the option enabled, the average latencies for both commands are
reduced significantly!

I think we could extend this to abort cleanup of remote
(sub)transactions during post-abort.  Anyway, I think this is useful,
so I’ll add this to the upcoming commitfest.

Best regards,
Etsuro Fujita

[1] 
https://www.postgresql.org/message-id/CAPmGK177E6HPcCQB4-s%2Bm9AcCZDHCC2drZy%2BFKnnvEXw9kXoXQ%40mail.gmail.com


postgres-fdw-parallel-commit-1.patch
Description: Binary data