Re: Mark all GUC variable as PGDLLIMPORT

2022-02-12 Thread Julien Rouhaud
Hi,

On Sat, Feb 12, 2022 at 07:59:56PM -0800, Andres Freund wrote:
> 
> On 2021-08-23 14:53:34 +0800, Julien Rouhaud wrote:
> > So since the not currently explicitly exported GUC global variables shouldn't
> > be accessible by third-party code, I'm attaching a POC patch that does the
> > opposite of v1: enforce that restriction using a new pg_attribute_hidden()
> > macro, defined with GCC only, to start discussing that topic.
> 
> This fails on cfbot: https://cirrus-ci.com/task/6424663592009728?logs=build#L1
> 
> I'm not feeling a lot of enthusiasm for the change.

Yes, which is also why I'm not planning on spending more effort on that (or the
opposite) unless I get some sort of feedback, so thanks a lot for the answer
here.

Note that if everyone is happy with the status quo, please say so.  I will
happily mark the CF entry rejected and stop all efforts to package
extensions on Windows.  It will save me a lot of effort that 99% of users
don't care about.

If not, maybe we could improve the situation, and also learn from the
max_backends thread.

Maybe we could have an actually usable GUC API to retrieve values in their
native format rather than as C strings, for instance, one that we could make
sure also works for cases like max_backends?
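Something along these lines, purely as a sketch (these typed getters do not
exist today; the names and signatures are invented here for illustration):

/*
 * Hypothetical typed GUC getters, in contrast to the existing
 * GetConfigOption(), which only hands the value back as a C string.
 * Invented names for discussion, not an existing API.
 */
extern bool GetConfigOptionBool(const char *name, bool missing_ok);
extern int	GetConfigOptionInt(const char *name, bool missing_ok);
extern double GetConfigOptionReal(const char *name, bool missing_ok);

/*
 * An extension could then fetch a value whose backing global lacks
 * PGDLLIMPORT on Windows, e.g.:
 */
int			work_mem_kb = GetConfigOptionInt("work_mem", false);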

> But if we were to do this,
> we'd have to have infrastructure to detect missing hidden declarations,
> otherwise it's inevitable that they don't get added.

Agreed.  For now I'm using a simple loop around

egrep "^\s+&[A-Za-z_]+,$" src/backend/utils/misc/guc.c | egrep -o "[A-Za-z_]+"

to get all the underlying symbols, and grep that again (actually using ag,
which makes it 3x faster) to detect the lack of the wanted annotation.  Were
you thinking of some script like that, maybe run before a release, or
something more involved?

> I kind of like the idea of hiding postgres symbols for other reasons than
> win<->everything else parity. Namely that not exporting symbols can allow the
> compiler to optimize more...

Yeah I saw that in nearby threads and I entirely agree.




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-02-12 Thread Dilip Kumar
On Sun, Feb 13, 2022 at 10:12 AM Dilip Kumar  wrote:
>

I have done performance testing with different template DB sizes and
different amounts of dirty shared buffers.  As expected, the more dirty
shared buffers there are, the costlier the checkpoint approach becomes,
and OTOH the larger the template DB, the more time the WAL-log approach
takes.

I think it is very common to have large shared buffers, and of course, if
somebody has configured shared buffers that large, a good % of them will be
dirty most of the time.  So IMHO the WAL-log approach is going to be the
more usable one in general going forward.  However, this is just my opinion;
others may have completely different thoughts, and anyhow we are keeping
both approaches as options, so no worry.

Next, I am planning to do some more tests with pgbench running while we
concurrently execute CREATE DATABASE, maybe every minute, to measure both
the CREATE DATABASE time and the impact on pgbench performance.  So far I
have only measured CREATE DATABASE time, but we should also know the impact
of CREATE DATABASE on the rest of the system.

Test setup:
max_wal_size=64GB
checkpoint_timeout=15min
- CREATE a base table the size of shared buffers
- CREATE a template database and a table in it of varying size (as per test)
- CHECKPOINT (write out dirty buffers)
- UPDATE 70% of tuples in the base table (dirtying 70% of shared buffers)
- CREATE DATABASE using the template DB (the actual test target)

test  shared buffers  template DB size  dirty shared buffers  Head (ms)  Patch (ms)
1     1 GB            6 MB              70%                    2341.665      85.229
2     1 GB            1 GB              70%                    4044        8376
3     8 GB            1 GB              70%                   21398        9834
4     8 GB            10 GB             95%                   38574       77160
5     32 GB           10 GB             70%                   47656       79767
6     64 GB           1 GB              70%                   59151        8742
7     64 GB           50 GB             50%                  171614      406040

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-13 18:07:30 +1300, Thomas Munro wrote:
> On Sun, Feb 13, 2022 at 5:50 PM Andres Freund  wrote:
> >> > +# required for 027_stream_regress.pl
> >> > +REGRESS_OUTPUTDIR=$(abs_top_builddir)/src/test/recovery
> >> > +export REGRESS_OUTPUTDIR
> >>
> >> Why do we need this?
> >
> > The Make macro "prove_check" (src/Makefile.global.in) always changes
> > to the source directory to run TAP tests. Without an explicit
> > directive to control where regression test output goes, it got
> > splattered all over the source tree in VPATH builds. I didn't see an
> > existing way to adjust that (did I miss something?). Hence desire to
> > pass down a path in the build tree. Better ideas welcome.
> 
> I thought it was a goal that VPATH builds shouldn't pollute the source
> tree, but the Make macro prove_check is explicitly doing so by
> default.  Perhaps *that* should be fixed?

Sure, prove changes into the source dir. But we don't put test data / output
into the source? That's what TESTDIR is used for:

# Determine output directories, and create them.  The base path is the
# TESTDIR environment variable, which is normally set by the invoking
# Makefile.
$tmp_check = $ENV{TESTDIR} ? "$ENV{TESTDIR}/tmp_check" : "tmp_check";
$log_path = "$tmp_check/log";

Afaics all the "regress test inside tap test" cases would need to do is pass
--outputdir=${PostgreSQL::Test::Utils::tmp_check}, and you'd get exactly the
same path as REGRESS_OUTPUTDIR currently provides.

I only use vpath builds, and I don't see any tap test data / logs in the
source tree.

Greetings,

Andres Freund




Re: Adding CI to our tree

2022-02-12 Thread Justin Pryzby
On Sat, Feb 12, 2022 at 05:20:08PM -0800, Andres Freund wrote:
> On 2022-02-03 23:04:04 -0600, Justin Pryzby wrote:
> > > I'm a bit worried about the increased storage and runtime overhead due to 
> > > the
> > > docs changes. We probably can make it a good bit cheaper though.
> >
> > If you mean overhead of additional disk operations, it shouldn't be an 
> > issue,
> > since the git clone uses shared references (not even hardlinks).
> 
> I was concerned about those and just the increased runtime of the script. Our
> sources are 130MB, leaving the shared .git aside. But maybe it's just fine.
> 
> We probably can just get rid of the separate clone and configure run though?
> Build the docs, copy the output, do a git co parent docs/, build again?

Yes - works great, thanks.

> What was the reason behind moving the docs stuff from the compiler warnings
> task to linux?

I wanted to build docs even if the linux task fails, and to allow CFBOT to
link to them, so someone can always review the docs in HTML (rather than
reading SGML with lines prefixed with +).

> Not that either fits very well... I think it might be worth
> moving the docs stuff into its own task, using a 1 CPU container (docs build
> isn't parallel anyway). Think that'll be easier to see in the cfbot page.

Yeah.  The only drawback is the duplication (including the "git parent" stuff).

BTW, docs can be built in parallel, and CI is using BUILD_JOBS: 4.
/usr/bin/xmllint --path . --noout --valid postgres.sgml
/usr/bin/xmllint --path . --noout --valid postgres.sgml
/usr/bin/xsltproc --path . --stringparam pg.version '15devel' stylesheet.xsl postgres.sgml
/usr/bin/xsltproc --path . --stringparam pg.version '15devel' stylesheet-man.xsl postgres.sgml

> 1) iterate over release branches and see which has the smallest diff

Maybe for each branch: do echo `git rev-list or merge-base | wc -l` $branch; done | sort -n | head -1

> > > Is looking at the .c files in the change really a reliable predictor of 
> > > where
> > > code coverage changes? I'm doubtful. Consider stuff like removing the last
> > > user of some infrastructure or such. Or adding the first.
> >
> > You're right that it isn't particularly accurate, but it's a step forward if
> > lots of patches use it to check/improve coverge of new code.
> 
> Maybe it's good enough... The overhead in test runtime is noticeable (~5.30m
> -> ~7.15m), but probably acceptable. Although I also would like to enable
> -fsanitize=alignment and -fsanitize=undefined, which add about 2 minutes as
> well (-fsanitize=address is a lot more expensive); they both work best on
> linux.

There's other things that it'd be nice to enable, but maybe these don't need to
be on by default.  Maybe just have a list of optional compiler flags (and hints
when they're useful).  Like WRITE_READ_PARSE_PLAN_TREES.

> > In addition to the HTML generated for changed .c files, it's possible to 
> > create
> > a coverage.gcov output for everything, which could be retrieved to generate
> > full HTML locally.  It's ~8MB (or 2MB after gzip).
> 
> Not sure that doing it locally adds much over just running tests
> locally.

You're right, since one needs to have the patched source files to generate the
HTML.

On Thu, Feb 03, 2022 at 11:57:18AM -0800, Andres Freund wrote:
> I think it may be a bit cleaner to have the "install additional packages"
> "template step" be just that, and not merge in other contents into it?

I renamed the "cores" task since it consistently makes me think it's doing
something with CPU cores.  I took it as an opportunity to generalize the task.

These patches are ready for review.  I'll plan to mail about TAP stuff
tomorrow.
From 05c24a1e679e5ae0dd0cb6504d397f77c8b1fc5c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 17 Jan 2022 00:53:04 -0600
Subject: [PATCH 1/3] cirrus: include hints how to install OS packages..

This is useful for patches during development, but once a feature is merged,
new libraries should be added to the OS image files, rather than installed
during every CI run forever into the future.
---
 .cirrus.yml | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index dd96a97efe5..eda8ac9596c 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -73,10 +73,11 @@ task:
 chown -R postgres:postgres .
 mkdir -p ${CCACHE_DIR}
 chown -R postgres:postgres ${CCACHE_DIR}
-  setup_cores_script: |
+  setup_os_script: |
 mkdir -m 770 /tmp/cores
 chown root:postgres /tmp/cores
 sysctl kern.corefile='/tmp/cores/%N.%P.core'
+#pkg install -y ...
 
   # NB: Intentionally build without --with-llvm. The freebsd image size is
   # already large enough to make VM startup slow, and even without llvm
@@ -180,10 +181,12 @@ task:
 chown -R postgres:postgres ${CCACHE_DIR}
 echo '* - memlock 134217728' > /etc/security/limits.d/postgres.conf
 su postgres -c "ulimit -l -H && ulimit -l -S"
-  setup_cores_script: |
+  setup_os_script: |
   

Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set

2022-02-12 Thread Tom Lane
Thomas Munro  writes:
> I thought it was a goal that VPATH builds shouldn't pollute the source
> tree, but the Make macro prove_check is explicitly doing so by
> default.  Perhaps *that* should be fixed?

Indeed.  That seems broken by definition.

More generally, I thought we'd established a convention that
all files made by TAP tests should be put inside the tmp_check
directory, to simplify cleanup and .gitignore rules.  But in
a VPATH build, tmp_check ought to live in the build tree.

regards, tom lane




Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set

2022-02-12 Thread Thomas Munro
On Sun, Feb 13, 2022 at 5:50 PM Andres Freund  wrote:
> On 2022-01-18 11:20:16 +0900, Michael Paquier wrote:
> > +REGRESS_OUTPUTDIR=$(abs_top_builddir)/src/bin/pg_upgrade
> > +export REGRESS_OUTPUTDIR
>
> I don't really understand why 027_stream_regress.pl is using this (and thus
> not why it's used here). The tap tests create files all the time, why is this
> different?
>
> It's not like make / msvc put the data in different places:
> src/test/recovery/Makefile:REGRESS_OUTPUTDIR=$(abs_top_builddir)/src/test/recovery/tmp_check
> src/tools/msvc/vcregress.pl: $ENV{REGRESS_OUTPUTDIR} = 
> "$topdir/src/test/recovery/tmp_check";

As I wrote in 
https://www.postgresql.org/message-id/CA%2BhUKGK-%2Bmg6RWiDu0JudF6jWeL5%2BgPmi8EKUm1eAzmdbwiE_A%40mail.gmail.com,

>> > +# required for 027_stream_regress.pl
>> > +REGRESS_OUTPUTDIR=$(abs_top_builddir)/src/test/recovery
>> > +export REGRESS_OUTPUTDIR
>>
>> Why do we need this?
>
> The Make macro "prove_check" (src/Makefile.global.in) always changes
> to the source directory to run TAP tests. Without an explicit
> directive to control where regression test output goes, it got
> splattered all over the source tree in VPATH builds. I didn't see an
> existing way to adjust that (did I miss something?). Hence desire to
> pass down a path in the build tree. Better ideas welcome.

I thought it was a goal that VPATH builds shouldn't pollute the source
tree, but the Make macro prove_check is explicitly doing so by
default.  Perhaps *that* should be fixed?




Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set

2022-02-12 Thread Andres Freund
Hi,

On 2022-01-18 11:20:16 +0900, Michael Paquier wrote:
> +# required for 002_pg_upgrade.pl
> +REGRESS_SHLIB=$(abs_top_builddir)/src/test/regress/regress$(DLSUFFIX)
> +export REGRESS_SHLIB

It seems weird to propagate this into multiple places. Why don't we define
that centrally?

Although it's weird for this to use REGRESS_SHLIB, given it's just doing
dirname() on it. 027_stream_regress.pl has the "defense" of not wanting to
duplicate the variable with 017_shm.pl...

Not that I understand why 017_shm.pl and all the regression test source
files even need $(DLSUFFIX) - expand_dynamic_library_name() should take care of
it?


> +REGRESS_OUTPUTDIR=$(abs_top_builddir)/src/bin/pg_upgrade
> +export REGRESS_OUTPUTDIR

I don't really understand why 027_stream_regress.pl is using this (and thus
not why it's used here). The tap tests create files all the time, why is this
different?

It's not like make / msvc put the data in different places:
src/test/recovery/Makefile:REGRESS_OUTPUTDIR=$(abs_top_builddir)/src/test/recovery/tmp_check
src/tools/msvc/vcregress.pl: $ENV{REGRESS_OUTPUTDIR} = 
"$topdir/src/test/recovery/tmp_check";

> +
> +# From now on, the test of pg_upgrade consists in setting up an instance.

What does "from now on" mean?



> +# Default is the location of this source code for both nodes used with
> +# the upgrade.

Can't quite parse.



> +# Initialize a new node for the upgrade.  This is done early so as it is
> +# possible to know with which node's PATH the initial dump needs to be
> +# taken.
> +my $newnode = PostgreSQL::Test::Cluster->new('new_node');
> +$newnode->init(extra => [ '--locale=C', '--encoding=LATIN1' ]);
> +my $newbindir = $newnode->config_data('--bindir');
> +my $oldbindir = $oldnode->config_data('--bindir');

Why C/LATIN1?

Right now pg_upgrade test.sh uses --wal-segsize 1, and that has helped
identify several bugs. So I'd rather not give it up, even if it's a bit weird.



> + my @regress_command = [
> + $ENV{PG_REGRESS},
> + '--schedule', "$oldsrc/src/test/regress/parallel_schedule",
> + '--bindir',   $oldnode->config_data('--bindir'),
> + '--dlpath',   $dlpath,
> + '--port', $oldnode->port,
> + '--outputdir',$outputdir,
> + '--inputdir', $inputdir,
> + '--use-existing'
> + ];

I think this should use --host (c.f. 7340aceed72). Or is it intending to use
the host via env? If so, why is the port specified?


> + @regress_command = (@regress_command, @extra_opts);
> +
> + $oldnode->command_ok(@regress_command,
> + 'regression test run on old instance');

I also think this should take EXTRA_REGRESS_OPTS into account - test.sh did.


> +# After dumping, update references to the old source tree's regress.so
> +# to point to the new tree.
> +if (defined($ENV{oldinstall}))
> +{


Kinda asking for its own function...

> +
> +# Update the instance.
> +$oldnode->stop;
> +
> +# Time for the real run.

As opposed to the unreal one?


> +# pg_upgrade would complain if PGHOST, so as there are no attempts to
> +# connect to a different server than the upgraded ones.

"complain if PGHOST"?


> +# Take a second dump on the upgraded instance.

Sounds like you're taking two post-upgrade pg_dumps.




Greetings,

Andres Freund




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-02-12 Thread Dilip Kumar
On Sat, Feb 12, 2022 at 2:38 AM Alvaro Herrera  wrote:
>
> On 2022-Feb-11, Robert Haas wrote:
>
> > What I find difficult about doing that is that this is all a bunch of
> > technical details that users may have difficulty understanding. If we
> > say WAL_LOG or WAL_LOG_DATA, a reasonably but not incredibly
> > well-informed user will assume that skipping WAL is not really an
> > option. If we say CHECKPOINT, a reasonably but not incredibly
> > well-informed user will presume they don't want one (I think).
> > CHECKPOINT also seems like it's naming the switch by the unwanted side
> > effect, which doesn't seem too flattering to the existing method.
>
> It seems you're thinking of deciding what to do based on an option that
> gets a boolean argument.  But what about making the argument be an enum?
> For example
>
> CREATE DATABASE ... WITH (STRATEGY = LOG);  -- default if option is omitted
> CREATE DATABASE ... WITH (STRATEGY = CHECKPOINT);
>
> So the user has to think about it in terms of some strategy to choose,
> rather than enabling or disabling some flag with nontrivial
> implications.


Yeah, I think being explicit about the strategy looks like a better option
for the user.  Now they can choose whether they want the database created
using WAL logging or using a CHECKPOINT.  Otherwise, if we give a flag, we
will have to explain that if they choose not to WAL-log, we will do a
checkpoint internally.  So giving LOG vs CHECKPOINT as an explicit option
looks better to me.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: pgsql: Add suport for server-side LZ4 base backup compression.

2022-02-12 Thread Tom Lane
Michael Paquier  writes:
> On Sat, Feb 12, 2022 at 05:16:03PM -0500, Tom Lane wrote:
>> I think adding an explicit PGAC_PATH_PROGS would be worth the cycles.

> Well, this somebody is me, and the buildfarm did not blow up on any of
> that, so it looked rather fine.

Eh?  copperhead for one is failing for exactly this reason:

Bailout called.  Further testing stopped:  failed to execute command "lz4 -d -m /home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup/base.tar.lz4": No such file or directory

> Adding a few cycles for this check
> is fine by me.  What do you think of the attached?

Looks OK as far as it goes ... but do we need anything on the
MSVC side?

Also, I can't help wondering why we have both GZIP_PROGRAM and GZIP
variables.  I suppose that's a separate issue though.

regards, tom lane




Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set

2022-02-12 Thread Andres Freund
Hi,

On 2022-01-18 11:20:16 +0900, Michael Paquier wrote:
> On Sat, Jan 15, 2022 at 01:52:39PM +0800, Julien Rouhaud wrote:
> > libpq environment variable PGHOST has a non-local server value: 
> > C:/Users/ContainerAdministrator/AppData/Local/Temp/FhBIlsw6SV
> > Failure, exiting
> > not ok 3 - run of pg_upgrade for new instance
> 
> There are two things here, as far as I understand:
> 1) This is a valid Windows path.  So shouldn't we fix pg_upgrade's
> server.c to be a bit more compliant with Windows paths?  The code
> accepts only paths beginning with '/' as local paths, so this breaks.

It also doesn't handle @ correctly. Makes sense to fix. Should probably use
the same logic that libpq, psql, ... use?

if (is_unixsock_path(ch->host))
	ch->type = CHT_UNIX_SOCKET;

that'd basically be the same amount of code. And easier to understand.
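(For context, is_unixsock_path() is a one-line macro in
src/include/libpq/pqcomm.h; quoting from memory, so treat this as
approximate:

#define is_unixsock_path(path) (*(path) == '/' || *(path) == '@')

i.e. it also accepts '@' for abstract-namespace Unix sockets.)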

Greetings,

Andres Freund




Re: Mark all GUC variable as PGDLLIMPORT

2022-02-12 Thread Andres Freund
Hi,


On 2021-08-23 14:53:34 +0800, Julien Rouhaud wrote:
> So since the not currently explicitly exported GUC global variables shouldn't
> be accessible by third-party code, I'm attaching a POC patch that does the
> opposite of v1: enforce that restriction using a new pg_attribute_hidden()
> macro, defined with GCC only, to start discussing that topic.

This fails on cfbot: https://cirrus-ci.com/task/6424663592009728?logs=build#L1


I'm not feeling a lot of enthusiasm for the change. But if we were to do this,
we'd have to have infrastructure to detect missing hidden declarations,
otherwise it's inevitable that they don't get added.

I kind of like the idea of hiding postgres symbols for other reasons than
win<->everything else parity. Namely that not exporting symbols can allow the
compiler to optimize more...

Greetings,

Andres Freund




Re: Fix overflow in DecodeInterval

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-12 21:16:05 -0500, Joseph Koshakow wrote:
> I've attached the patch below.

Any reason for using int return types?


> +/* As above, but initial val produces years */
> +static int
> +AdjustYears(int val, struct pg_tm *tm, int multiplier)
> +{
> + int years;
> + if (pg_mul_s32_overflow(val, multiplier, &years))
> + return 1;
> + if (pg_add_s32_overflow(tm->tm_year, years, &tm->tm_year))
> + return 1;
> + return 0;
>  }

particularly since the pg_*_overflow stuff uses bool?
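For comparison, a bool-returning variant matching the pg_*_overflow
convention could look like the sketch below (just an illustration of the
suggested signature change, not code from the patch):

/* As above, but initial val produces years; returns true on overflow */
static bool
AdjustYears(int val, struct pg_tm *tm, int multiplier)
{
	int			years;

	if (pg_mul_s32_overflow(val, multiplier, &years))
		return true;
	if (pg_add_s32_overflow(tm->tm_year, years, &tm->tm_year))
		return true;
	return false;
}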


Greetings,

Andres Freund




Re: Adding CI to our tree

2022-02-12 Thread Andres Freund
On 2022-02-12 20:47:04 -0600, Justin Pryzby wrote:
> While rebasing, I noticed an error.
> 
> You wrote **/.diffs, but should be **/*.diffs

Embarrassing. Thanks for noticing! Pushed the fix...




Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-02-12 Thread Andres Freund
On 2022-02-13 12:36:13 +0900, Masahiko Sawada wrote:
> Actually, I'm working on simplifying and improving the radix tree
> implementation for the PG16 dev cycle. From the discussion so far I think
> it's better to have a data structure that is usable for general purposes
> and is also good for storing TIDs, not something specific to storing TIDs.
> So I think a radix tree would be a potential candidate. I have implemented
> insertion and search so far.

Awesome!




using file cloning in create database / initdb

2022-02-12 Thread Andres Freund
Hi,

This thread started at 
https://www.postgresql.org/message-id/20220213021746.GM31460%40telsasoft.com
but is mostly independent, so I split it off into its own thread.

On 2022-02-12 20:17:46 -0600, Justin Pryzby wrote:
> On Sat, Feb 12, 2022 at 06:00:44PM -0800, Andres Freund wrote:
> > I bet using COW file copies would speed up our own regression tests 
> > noticeably
> > - on slower systems we spend a fair bit of time and space creating template0
> > and postgres, with the bulk of the data never changing.
> > 
> > Template databases are also fairly commonly used by application developers 
> > to
> > avoid the cost of rerunning all the setup DDL & initial data loading for
> > different tests. Making that measurably cheaper would be a significant win.
> 
> +1
> 
> I ran into this last week and was still thinking about proposing it.
> 
> Would this help CI

It could theoretically help linux - but currently I think the filesystem for
CI is ext4, which doesn't support FICLONE. I assume it'd help macos, but I
don't know the performance characteristics of copyfile(). I don't think any of
the other OSs have working reflink / file clone support.

You could prototype it for CI on macos by using the "template initdb" patch
and passing -c to cp.

On linux it might be worth using copy_file_range(), if supported, where file
cloning isn't available. But that's kind of an even more separate topic...
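For illustration, a minimal sketch of that combination on Linux: clone with
the FICLONE ioctl where the filesystem supports reflinks, else fall back to
an in-kernel copy via copy_file_range(). This is not PostgreSQL code; the
function name is invented and error handling is minimal:

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#ifdef __linux__
#include <linux/fs.h>			/* FICLONE */
#endif

static int
clone_or_copy_file(const char *src, const char *dst)
{
	int			srcfd = open(src, O_RDONLY);
	int			dstfd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	int			ret = -1;

	if (srcfd < 0 || dstfd < 0)
		goto done;

#ifdef FICLONE
	/* cheap COW clone, if src and dst are on a reflink-capable fs */
	if (ioctl(dstfd, FICLONE, srcfd) == 0)
	{
		ret = 0;
		goto done;
	}
#endif

	/* otherwise do an in-kernel copy, without bouncing through userspace */
	for (;;)
	{
		ssize_t		nw = copy_file_range(srcfd, NULL, dstfd, NULL,
										 (size_t) 1 << 30, 0);

		if (nw < 0)
			goto done;
		if (nw == 0)
			break;				/* EOF */
	}
	ret = 0;

done:
	if (srcfd >= 0)
		close(srcfd);
	if (dstfd >= 0)
		close(dstfd);
	return ret;
}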


>  or any significant fraction of the buildfarm?

Not sure how many are on new enough linux / mac to benefit and use a suitable
filesystem. There are a few animals with slow-ish storage but running fairly
new linux. Don't think we can see the FS. Those would likely benefit the most.


> Or just tests run locally on supporting filesystems.

Probably depends on your storage subsystem. If not that fast, and running
tests concurrently, it'd likely help.


On my workstation, with lots of cores and very fast storage, using the initdb
caching patch modified to do cp --reflink=never / always yields the following
time for concurrent check-world (-j40 PROVE_FLAGS=-j4):

cp --reflink=never:

96.64user 61.74system 1:04.69elapsed 244%CPU (0avgtext+0avgdata 97544maxresident)k
0inputs+34124296outputs (2584major+7247038minor)pagefaults 0swaps
pcheck-world-success

cp --reflink=always:

91.79user 56.16system 1:04.21elapsed 230%CPU (0avgtext+0avgdata 97716maxresident)k
189328inputs+16361720outputs (2674major+7229696minor)pagefaults 0swaps
pcheck-world-success

Seems roughly stable across three runs.


Just comparing the time for cp -r of a fresh initdb'd cluster:
cp -a --reflink=never
real	0m0.043s
user	0m0.000s
sys	0m0.043s
cp -a --reflink=always
real	0m0.021s
user	0m0.004s
sys	0m0.018s

so that's a pretty nice win.


> Note that pg_upgrade already supports copy/link/clone.  (Obviously, link
> wouldn't do anything desirable for CREATE DATABASE).

Yea. We'd likely have to move relevant code into src/port.


Greetings,

Andres Freund




Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-02-12 Thread Masahiko Sawada
On Sun, Feb 13, 2022 at 11:02 AM Andres Freund  wrote:
>
> Hi,
>
> On 2022-02-11 13:47:01 +0100, Matthias van de Meent wrote:
> > Today I noticed the inefficiencies of our dead tuple storage once
> > again, and started theorizing about a better storage method; which is
> > when I remembered that this thread exists, and that this thread
> > already has amazing results.
> >
> > Are there any plans to get the results of this thread from PoC to 
> > committable?
>
> I'm not currently planning to work on it personally. It'd be awesome if
> somebody did...

Actually, I'm working on simplifying and improving the radix tree
implementation for the PG16 dev cycle. From the discussion so far I think
it's better to have a data structure that is usable for general purposes
and is also good for storing TIDs, not something specific to storing TIDs.
So I think a radix tree would be a potential candidate. I have implemented
insertion and search so far.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: pgsql: Add suport for server-side LZ4 base backup compression.

2022-02-12 Thread Michael Paquier
On Sat, Feb 12, 2022 at 05:16:03PM -0500, Tom Lane wrote:
> It looks to me like somebody figured it didn't need any more support
> than gzip/bzip2, which is wrong on a couple of grounds:
> * hardly any modern platforms lack those, unlike lz4
> * we don't invoke either one of them during testing, only when
>   you explicitly ask to make a compressed tarball
> 
> I think adding an explicit PGAC_PATH_PROGS would be worth the cycles.

Well, this somebody is me, and the buildfarm did not blow up on any of
that, so it looked rather fine.  Adding a few cycles for this check
is fine by me.  What do you think of the attached?
--
Michael
diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
index 0e6e685aa6..88dff2e673 100644
--- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl
+++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
@@ -130,8 +130,7 @@ SKIP:
 	my $gzip = $ENV{GZIP_PROGRAM};
 	skip "program gzip is not found in your system", 1
 	  if ( !defined $gzip
-		|| $gzip eq ''
-		|| system_log($gzip, '--version') != 0);
+		|| $gzip eq '' );
 
 	my $gzip_is_valid = system_log($gzip, '--test', @zlib_wals);
 	is($gzip_is_valid, 0,
@@ -186,8 +185,7 @@ SKIP:
 	my $lz4 = $ENV{LZ4};
 	skip "program lz4 is not found in your system", 1
 	  if ( !defined $lz4
-		|| $lz4 eq ''
-		|| system_log($lz4, '--version') != 0);
+		|| $lz4 eq '' );
 
 	my $lz4_is_valid = system_log($lz4, '-t', @lz4_wals);
 	is($lz4_is_valid, 0,
diff --git a/configure b/configure
index 0d52af5529..930658 100755
--- a/configure
+++ b/configure
@@ -650,6 +650,7 @@ CFLAGS_ARMV8_CRC32C
 CFLAGS_SSE42
 have_win32_dbghelp
 LIBOBJS
+LZ4
 UUID_LIBS
 LDAP_LIBS_BE
 LDAP_LIBS_FE
@@ -13832,6 +13833,60 @@ fi
 
 fi
 
+if test -z "$LZ4"; then
+  for ac_prog in lz4
+do
+  # Extract the first word of "$ac_prog", so it can be a program name with args.
+set dummy $ac_prog; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_LZ4+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $LZ4 in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_LZ4="$LZ4" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+ac_cv_path_LZ4="$as_dir/$ac_word$ac_exec_ext"
+$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+LZ4=$ac_cv_path_LZ4
+if test -n "$LZ4"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $LZ4" >&5
+$as_echo "$LZ4" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+  test -n "$LZ4" && break
+done
+
+else
+  # Report the value of LZ4 in configure's output in all cases.
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for LZ4" >&5
+$as_echo_n "checking for LZ4... " >&6; }
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $LZ4" >&5
+$as_echo "$LZ4" >&6; }
+fi
+
 if test "$with_lz4" = yes; then
   for ac_header in lz4.h
 do :
diff --git a/configure.ac b/configure.ac
index 2afc822b12..16167329fc 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1485,6 +1485,7 @@ failure.  It is possible the compiler isn't looking in the proper directory.
 Use --without-zlib to disable zlib support.])])
 fi
 
+PGAC_PATH_PROGS(LZ4, lz4)
 if test "$with_lz4" = yes; then
   AC_CHECK_HEADERS(lz4.h, [], [AC_MSG_ERROR([lz4.h header file is required for LZ4])])
 fi
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 05c54b27de..9dcd54fcbd 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -350,7 +350,7 @@ XGETTEXT = @XGETTEXT@
 
 GZIP	= gzip
 BZIP2	= bzip2
-LZ4	= lz4
+LZ4	= @LZ4@
 
 DOWNLOAD = wget -O $@ --no-use-server-timestamps
 #DOWNLOAD = curl -o $@




Re: Adding CI to our tree

2022-02-12 Thread Justin Pryzby
On Sat, Feb 12, 2022 at 04:24:20PM -0800, Andres Freund wrote:
> > > e5286ede1b4 cirrus: avoid unnecessary double star **
> > 
> > Can't get excited about this, but whatever.
> > 
> > What I am excited about is that some of your other changes showed that we
> > don't need separate *_artifacts for separate directories anymore. That used 
> > to
> > be the case, but an array of paths is now supported. Putting log, diffs, and
> > regress_log in one directory will be considerably more convenient...
> 
> pushed together.

While rebasing, I noticed an error.

You wrote **/.diffs, but should be **/*.diffs

--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -30,15 +30,11 @@ env:
 # What files to preserve in case tests fail
 on_failure: _failure
   log_artifacts:
-path: "**/**.log"
+paths:
+  - "**/*.log"
+  - "**/.diffs"
+  - "**/regress_log_*"
 type: text/plain
-  regress_diffs_artifacts:
-path: "**/**.diffs"
-type: text/plain
-  tap_artifacts:
-path: "**/regress_log_*"
-type: text/plain
-




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-02-12 Thread Justin Pryzby
On Sat, Feb 12, 2022 at 06:00:44PM -0800, Andres Freund wrote:
> Hi,
> 
> On 2022-02-11 16:19:12 -0500, Robert Haas wrote:
> > I somewhat hope we never end up with THREE strategies for creating a new
> > database, but now that I think about it, we might. Somebody might want to
> > use a fancy FS primitive that clones a directory at the FS level, or
> > something.
> 
> I think that'd be a great, and pretty easy to implement, feature. But it seems
> like it'd be mostly orthogonal to the "WAL log data" vs "checkpoint data"
> question? On the primary / single node system using "WAL log data" with "COW
> file copy" would work well.
> 
> I bet using COW file copies would speed up our own regression tests noticeably
> - on slower systems we spend a fair bit of time and space creating template0
> and postgres, with the bulk of the data never changing.
> 
> Template databases are also fairly commonly used by application developers to
> avoid the cost of rerunning all the setup DDL & initial data loading for
> different tests. Making that measurably cheaper would be a significant win.

+1

I ran into this last week and was still thinking about proposing it.

Would this help CI or any significant fraction of the buildfarm?
Or just tests run locally on supporting filesystems?

Note that pg_upgrade already supports copy/link/clone.  (Obviously, link
wouldn't do anything desirable for CREATE DATABASE).

-- 
Justin




Re: Fix overflow in DecodeInterval

2022-02-12 Thread Joseph Koshakow
On Fri, Feb 11, 2022 at 4:58 PM Joseph Koshakow  wrote:
>
> Tom Lane  writes:
> > Joseph Koshakow  writes:
> > > The attached patch fixes an overflow bug in DecodeInterval when applying
> > > the units week, decade, century, and millennium. The overflow check logic
> > > was modelled after the overflow check at the beginning of `int
> > > tm2interval(struct pg_tm *tm, fsec_t fsec, Interval *span);` in 
> > > timestamp.c.
> >
> >
> > Good catch, but I don't think that tm2interval code is best practice
> > anymore.  Rather than bringing "double" arithmetic into the mix,
> > you should use the overflow-detecting arithmetic functions in
> > src/include/common/int.h.  The existing code here is also pretty
> > faulty in that it doesn't notice addition overflow when combining
> > multiple units.  So for example, instead of
> >
> >
> > tm->tm_mday += val * 7;
> >
> >
> > I think we should write something like
> >
> >
> > if (pg_mul_s32_overflow(val, 7, &tmp))
> > return DTERR_FIELD_OVERFLOW;
> > if (pg_add_s32_overflow(tm->tm_mday, tmp, &tm->tm_mday))
> > return DTERR_FIELD_OVERFLOW;
> >
> >
> > Perhaps some macros could be used to make this more legible?
> >
> >
> > regards, tom lane
> >
> >
>
> Thanks for the feedback Tom, I've attached an updated patch with
> your suggestions. Feel free to rename the horribly named macro.
>
> Also while fixing this I noticed that fractional intervals can also
> cause an overflow issue.
> postgres=# SELECT INTERVAL '0.1 months 2147483647 days';
>  interval
> --
>  -2147483646 days
> (1 row)
> I haven't looked into it, but it's probably a similar cause.

Hey Tom,

I was originally going to fix the fractional issue in a follow-up
patch, but I took a look at it and decided to make a new patch
with both fixes. I tried to make the handling of fractional and
non-fractional units consistent. I've attached the patch below.

While investigating this, I've noticed a couple more overflow
issues with Intervals, but I think they're best handled in separate
patches. I've included the ones I've found in this email.

  postgres=# CREATE TEMP TABLE INTERVAL_TBL_OF (f1 interval);
  CREATE TABLE
  postgres=# INSERT INTO INTERVAL_TBL_OF (f1) VALUES ('0.1 weeks 2147483647 hrs');
  INSERT 0 1
  postgres=# SELECT * FROM INTERVAL_TBL_OF;
  ERROR:  interval out of range

  postgres=# SELECT justify_days(INTERVAL '2147483647 months 2147483647 days');
             justify_days
  ----------------------------------
   -172991738 years -4 mons -23 days
  (1 row)
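As an aside, the "horribly named macro" mentioned above might look something
like the following sketch (invented here for illustration; the patch's
actual macro may differ):

/* hypothetical helper; only usable in functions returning a dterr code */
#define ADJUST_WITH_OVERFLOW_CHECK(field, val, mult) \
	do { \
		int			tmp_; \
		if (pg_mul_s32_overflow((val), (mult), &tmp_)) \
			return DTERR_FIELD_OVERFLOW; \
		if (pg_add_s32_overflow((field), tmp_, &(field))) \
			return DTERR_FIELD_OVERFLOW; \
	} while (0)

/* e.g., weeks contribute days: */
ADJUST_WITH_OVERFLOW_CHECK(tm->tm_mday, val, 7);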

Cheers,
Joe Koshakow
From de041874b73600312e7ce71931c297f0aad3aa06 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Fri, 11 Feb 2022 14:18:32 -0500
Subject: [PATCH] Check for overflow when decoding an interval

When decoding an interval there are various date units which are
aliases for some multiple of another unit. For example a week is 7 days
and a decade is 10 years. When decoding these specific units, there is
no check for overflow, allowing the interval to overflow. This commit
adds an overflow check for all of these units.

Additionally fractional date/time units are rolled down into the next
smallest unit. For example 0.1 months is 3 days. When adding these
fraction units, there is no check for overflow, allowing the interval
to overflow. This commit adds an overflow check for all of the
fractional units.

Signed-off-by: Joseph Koshakow 
---
 src/backend/utils/adt/datetime.c   | 103 +++--
 src/test/regress/expected/interval.out |  56 ++
 src/test/regress/sql/interval.sql  |  15 
 3 files changed, 152 insertions(+), 22 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 7926258c06..f89cc7c29b 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -21,6 +21,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/string.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -44,10 +45,13 @@ static int	DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 	   struct pg_tm *tm);
 static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 		   int precision, bool fillzeros);
-static void AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
+static int AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
 			   int scale);
-static void AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
+static int AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
 			int scale);
+static int AdjustFractMonths(double frac, struct pg_tm *tm, int scale);
+static int AdjustDays(int val, struct pg_tm *tm, int multiplier);
+static int AdjustYears(int val, struct pg_tm *tm, int multiplier);
 static int	DetermineTimeZoneOffsetInternal(struct pg_tm *tm, pg_tz *tzp,
 			pg_time_t *tp);
 static bool 

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-11 13:47:01 +0100, Matthias van de Meent wrote:
> Today I noticed the inefficiencies of our dead tuple storage once
> again, and started theorizing about a better storage method; which is
> when I remembered that this thread exists, and that this thread
> already has amazing results.
> 
> Are there any plans to get the results of this thread from PoC to committable?

I'm not currently planning to work on it personally. It'd be awesome if
somebody did...

Greetings,

Andres Freund




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-11 16:19:12 -0500, Robert Haas wrote:
> I somewhat hope we never end up with THREE strategies for creating a new
> database, but now that I think about it, we might. Somebody might want to
> use a fancy FS primitive that clones a directory at the FS level, or
> something.

I think that'd be a great, and pretty easy to implement, feature. But it seems
like it'd be mostly orthogonal to the "WAL log data" vs "checkpoint data"
question? On the primary / single node system using "WAL log data" with "COW
file copy" would work well.

I bet using COW file copies would speed up our own regression tests noticeably
- on slower systems we spend a fair bit of time and space creating template0
and postgres, with the bulk of the data never changing.

Template databases are also fairly commonly used by application developers to
avoid the cost of rerunning all the setup DDL & initial data loading for
different tests. Making that measurably cheaper would be a significant win.

Greetings,

Andres Freund




Re: Why is src/test/modules/commit_ts/t/002_standby.pl flaky?

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-12 11:47:20 -0500, Tom Lane wrote:
> Alexander Lakhin  writes:
> > 11.02.2022 05:22, Andres Freund wrote:
> >> Over in another thread I made some wild unsubstantiated guesses that the
> >> windows issues could have been made much more likely by a somewhat odd bit 
> >> of
> >> code in PQisBusy():
> >> https://postgr.es/m/1959196.1644544971%40sss.pgh.pa.us
> >> Alexander, any chance you'd try if that changes the likelihood of the 
> >> problem
> >> occurring, without any other fixes / reverts applied?
> 
> > Unfortunately I haven't seen an improvement for the test in question.

Thanks for testing!


> Yeah, that's what I expected, sadly.  While I think this PQisBusy behavior
> is definitely a bug, it will not lead to an infinite loop, just to write
> failures being reported in a less convenient fashion than intended.

FWIW, I didn't think it'd end up looping indefinitely, but that there's a
chance it could end up waiting indefinitely. The WaitLatchOrSocket() doesn't
have a timeout, and if I understand the windows FD_CLOSE stuff correctly,
you're not guaranteed to get an event if you do WaitForMultipleObjects if
FD_CLOSE was already consumed and if there isn't any data to read.


ISTM that it's not a great idea for libpqrcv_receive() to do blocking IO at
all. The caller expects it to not block...


> I wonder whether it would help to put a PQconsumeInput call *before*
> the PQisBusy loop, so that any pre-existing EOF condition will be
> detected.  If you don't like duplicating code, we could restructure
> the loop as

That does look a bit saner. Even leaving EOF and windows issues aside, it
seems weird to do a WaitLatchOrSocket() without having tried to read more
data.
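A minimal sketch of that shape of loop, consuming input before asking
PQisBusy() whether a result is ready (PQconsumeInput/PQisBusy/PQgetResult
are real libpq calls; the helpers and overall structure are invented here
for illustration, not the actual libpqwalreceiver code):

for (;;)
{
	/*
	 * Read whatever is available; 0 means a connection error, which also
	 * catches a pre-existing EOF.
	 */
	if (PQconsumeInput(conn) == 0)
		report_connection_error(conn);	/* hypothetical helper */

	if (!PQisBusy(conn))
		break;					/* a complete result is ready */

	/* hypothetical wait on the socket, e.g. via WaitLatchOrSocket() */
	wait_for_socket_readable(conn);
}
result = PQgetResult(conn);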

Greetings,

Andres Freund




Re: Adding CI to our tree

2022-02-12 Thread Andres Freund
Hi,

Alvaro, adding you because of the "branch" discussion in the MERGE thread.


On 2022-02-03 23:04:04 -0600, Justin Pryzby wrote:
> > I'm a bit worried about the increased storage and runtime overhead due to 
> > the
> > docs changes. We probably can make it a good bit cheaper though.
>
> If you mean overhead of additional disk operations, it shouldn't be an issue,
> since the git clone uses shared references (not even hardlinks).

I was concerned about those and just the increased runtime of the script. Our
sources are 130MB, leaving the shared .git aside. But maybe it's just fine.

We probably can just get rid of the separate clone and configure run though?
Build the docs, copy the output, do a git co parent docs/, build again?


What was the reason behind moving the docs stuff from the compiler warnings
task to linux? Not that either fits very well... I think it might be worth
moving the docs stuff into its own task, using a 1 CPU container (docs build
isn't parallel anyway). Think that'll be easier to see in the cfbot page. I
think it's also good to run it independent of the linux task succeeding - a
docs failure seems like a separate enough issue.


> > > 9d0f03d3450 wip: cirrus: code coverage
> >
> > I don't think it's good to just unconditionally reference the master branch
> > here. It'll do bogus stuff once 15 is branched off. It works for cfbot, but
> > not other uses.
>
> That's only used for filtering changed files.  It uses git diff --cherry-pick
> postgres/master..., which would *try* to avoid showing anything which is also
> in master.

You commented in another email on this:

On 2022-02-11 12:51:50 -0600, Justin Pryzby wrote:
> Because I put your patch on top of some other branch with the CI coverage (and
> other stuff).
>
> It tries to only show files changed by the branch being checked:
> https://github.com/justinpryzby/postgres/commit/d668142040031915
>
> But it has to figure out where the branch "starts".  Which I did by looking at
> "git diff --cherry-pick origin..."
>
> Andres thinks that does the wrong thing if CI is run manually (not by CFBOT)
> for patches against backbranches.  I'm not sure git diff --cherry-pick is
> widely known/used, but I think using that relative to master may be good
> enough.

Note that I'm not concerned about "manually" running CI against other branches
- I'm concerned about the point where 15 branches off and CI will
automatically also run against 15. E.g. in the postgres repo
https://cirrus-ci.com/github/postgres/postgres/

I can see a few ways to deal with this:
1) iterate over release branches and see which has the smallest diff
2) parse the branch name, if it's a cfbot run, we know it's master, otherwise 
skip
3) change cfbot so that it maintains things as pull requests, which have a
   base branch


> > Is looking at the .c files in the change really a reliable predictor of 
> > where
> > code coverage changes? I'm doubtful. Consider stuff like removing the last
> > user of some infrastructure or such. Or adding the first.
>
> You're right that it isn't particularly accurate, but it's a step forward if
> lots of patches use it to check/improve coverge of new code.

Maybe it's good enough... The overhead in test runtime is noticeable (~5.30m
-> ~7.15m), but probably acceptable. Although I also would like to enable
-fsanitize=alignment and -fsanitize=undefined, which add about 2 minutes as
well (-fsanitize=address is a lot more expensive); they both work best on
linux.


> In addition to the HTML generated for changed .c files, it's possible to 
> create
> a coverage.gcov output for everything, which could be retrieved to generate
> full HTML locally.  It's ~8MB (or 2MB after gzip).

Not sure that doing it locally adds much over just running tests
locally.

Greetings,

Andres Freund




Re: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 1173, ...

2022-02-12 Thread Andres Freund
Hi, 

On February 12, 2022 3:57:28 PM PST, Thomas Munro  
wrote:
>Hi,
>
>I don't know what happened here, but my little script that scrapes
>for build farm assertion failures hasn't seen this one before now.
>It's on a 15 year old macOS release and compiled with a 15 year old
>GCC version, for some added mystery.
>
>https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=locust&dt=2022-02-12%2022%3A07%3A21
>
>2022-02-13 00:09:52.859 CET [92423:92] pg_regress/replorigin
>STATEMENT:  SELECT data FROM
>pg_logical_slot_get_changes('regression_slot', NULL, NULL,
>'include-xids', '0', 'skip-empty-xacts', '1', 'include-sequences',
>'0');
>TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File:
>"reorderbuffer.c", Line: 1173, PID: 92423)
>0   postgres    0x00592dd0 ExceptionalCondition + 160\\0
>1   postgres    0x00391c40 ReorderBufferGetRelids + 304\\0
>2   postgres    0x00394794 ReorderBufferAllocate + 1044\\0
>3   postgres    0x003834dc heap_decode + 76\\0
>4   postgres    0x003829c8 LogicalDecodingProcessRecord + 168\\0
>5   postgres    0x0038a1d4 pg_logical_emit_message_text + 1876\\0
>6   postgres    0x00241810 ExecMakeTableFunctionResult + 576\\0
>7   postgres    0x002576c4 ExecInitFunctionScan + 1444\\0
>8   postgres    0x00242140 ExecScan + 896\\0
>9   postgres    0x002397cc standard_ExecutorRun + 540\\0
>10  postgres    0x00427940 EnsurePortalSnapshotExists + 832\\0
>11  postgres    0x00428e84 PortalRun + 644\\0
>12  postgres    0x004252e0 pg_plan_queries + 3232\\0
>13  postgres    0x00426270 PostgresMain + 3424\\0
>14  postgres    0x0036b9ec PostmasterMain + 8060\\0
>15  postgres    0x0029f24c main + 1356\\0
>16  postgres    0x2524 start + 68\\0

I think that may be the bug that Tomas fixed a short while ago. He said the
skip-empty-xacts stuff didn't yet work for sequences. That's only likely to
be hit on slower machines...

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: Adding CI to our tree

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-03 11:57:18 -0800, Andres Freund wrote:
> > 95793675633 cirrus: spell ccache_maxsize
> 
> Yep, will apply with a bunch of your other changes, if you answer a question
> or two...

Pushed.


> > e5286ede1b4 cirrus: avoid unnecessary double star **
> 
> Can't get excited about this, but whatever.
> 
> What I am excited about is that some of your other changes showed that we
> don't need separate *_artifacts for separate directories anymore. That used to
> be the case, but an array of paths is now supported. Putting log, diffs, and
> regress_log in one directory will be considerably more convenient...

pushed together.


> > 398b7342349 cirrus/macos: uname -a and export at end of sysinfo
> 
> Shrug.

Pushed.

Greetings,

Andres Freund




FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 1173, ...

2022-02-12 Thread Thomas Munro
Hi,

I don't know what happened here, but my little script that scrapes
for build farm assertion failures hasn't seen this one before now.
It's on a 15 year old macOS release and compiled with a 15 year old
GCC version, for some added mystery.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=locust&dt=2022-02-12%2022%3A07%3A21

2022-02-13 00:09:52.859 CET [92423:92] pg_regress/replorigin
STATEMENT:  SELECT data FROM
pg_logical_slot_get_changes('regression_slot', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1', 'include-sequences',
'0');
TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File:
"reorderbuffer.c", Line: 1173, PID: 92423)
0   postgres    0x00592dd0 ExceptionalCondition + 160\\0
1   postgres    0x00391c40 ReorderBufferGetRelids + 304\\0
2   postgres    0x00394794 ReorderBufferAllocate + 1044\\0
3   postgres    0x003834dc heap_decode + 76\\0
4   postgres    0x003829c8 LogicalDecodingProcessRecord + 168\\0
5   postgres    0x0038a1d4 pg_logical_emit_message_text + 1876\\0
6   postgres    0x00241810 ExecMakeTableFunctionResult + 576\\0
7   postgres    0x002576c4 ExecInitFunctionScan + 1444\\0
8   postgres    0x00242140 ExecScan + 896\\0
9   postgres    0x002397cc standard_ExecutorRun + 540\\0
10  postgres    0x00427940 EnsurePortalSnapshotExists + 832\\0
11  postgres    0x00428e84 PortalRun + 644\\0
12  postgres    0x004252e0 pg_plan_queries + 3232\\0
13  postgres    0x00426270 PostgresMain + 3424\\0
14  postgres    0x0036b9ec PostmasterMain + 8060\\0
15  postgres    0x0029f24c main + 1356\\0
16  postgres    0x2524 start + 68\\0




Re: Adding CI to our tree

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-12 16:06:40 -0600, Justin Pryzby wrote:
> I had some success with that, but it doesn't seem to be significantly faster -
> it looks a lot like the tests are not actually running in parallel.

Note that prove unfortunately serializes the test output to be in the order it
started them, even when tests run concurrently. Extremely unhelpful, but ...

Isn't this kind of a good test time? I thought earlier your alltaptests target
took a good bit longer?

One nice bit is that the output is a *lot* easier to read.


You could try improving the total time by having prove remember slow tests and
use that file to run the slowest tests first next time. --state slow,save or
such I believe. Of course we'd have to save that state file...

To verify that tests actually run concurrently you could emit a few
notes. Looks like those are output immediately...

Greetings,

Andres Freund




Re: pgsql: Add suport for server-side LZ4 base backup compression.

2022-02-12 Thread Tom Lane
Robert Haas  writes:
> On Sat, Feb 12, 2022 at 5:09 PM Andres Freund  wrote:
>> I don't think there's an actual configure check for the lz4 binary? Looks 
>> like
>> a static assignment in src/Makefile.global.in to me.

> Oh. That seems kind of dumb.

It looks to me like somebody figured it didn't need any more support
than gzip/bzip2, which is wrong on a couple of grounds:
* hardly any modern platforms lack those, unlike lz4
* we don't invoke either one of them during testing, only when
  you explicitly ask to make a compressed tarball

I think adding an explicit PGAC_PATH_PROGS would be worth the cycles.

regards, tom lane




Re: pgsql: Add suport for server-side LZ4 base backup compression.

2022-02-12 Thread Robert Haas
On Sat, Feb 12, 2022 at 5:09 PM Andres Freund  wrote:
> On 2022-02-12 07:24:46 -0500, Robert Haas wrote:
> > You may be right, but I'm not entirely convinced. $ENV{'LZ4'} here is
> > being set by make, and make is setting it to whatever configure found.
> > If configure found a version of the lz4 program that doesn't actually
> > work, isn't that configure's fault, or the system administrator's
> > fault, rather than this test's fault?
>
> I don't think there's an actual configure check for the lz4 binary? Looks like
> a static assignment in src/Makefile.global.in to me.

Oh. That seems kind of dumb.

It does mean that trying to run it is all that we can do, though,
because we don't have a full path.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pgsql: Add suport for server-side LZ4 base backup compression.

2022-02-12 Thread Andres Freund
On 2022-02-12 07:24:46 -0500, Robert Haas wrote:
> You may be right, but I'm not entirely convinced. $ENV{'LZ4'} here is
> being set by make, and make is setting it to whatever configure found.
> If configure found a version of the lz4 program that doesn't actually
> work, isn't that configure's fault, or the system administrator's
> fault, rather than this test's fault?

I don't think there's an actual configure check for the lz4 binary? Looks like
a static assignment in src/Makefile.global.in to me.




Re: Adding CI to our tree

2022-02-12 Thread Justin Pryzby
On Tue, Jan 18, 2022 at 05:16:26PM -0800, Andres Freund wrote:
> On 2022-01-18 15:08:47 -0600, Justin Pryzby wrote:
> > On Mon, Jan 17, 2022 at 12:16:19PM -0800, Andres Freund wrote:
> > > I think it might still be worth adding stopgap way of running all tap 
> > > tests on
> > > windows though. Having a vcregress.pl function to find all directories 
> > > with t/
> > > and run the tests there, shouldn't be a lot of code...
> > 
> > I started doing that; however, it makes CI/windows even slower.
...
> > I think it'll be necessary to run prove with all the tap tests to
> > parallelize them, rather than looping around directories, many of which have
> > only a single file, and are run serially.
> 
> That's unfortunately not trivially possible. Quite a few tests currently rely
> on being called in a specific directory. We should fix this, but it's not a
> trivial amount of work.

On Sat, Feb 05, 2022 at 07:23:39PM -0800, Andres Freund wrote:
> On 2022-02-03 23:04:04 -0600, Justin Pryzby wrote:
> > > I assume this doesn't yet work to a meaningful degree? Last time I checked
> > > there were quite a few tests that needed to be invoked in a specific
> > > directory.
> > 
> > It works - tap_check() does chdir().
> 
> Ah, I thought you'd implemented a target that does it all in one prove
> invocation...

I had some success with that, but it doesn't seem to be significantly faster -
it looks a lot like the tests are not actually running in parallel.  I tried
some variations like passing the list of dirs vs the list of files, and
--jobs=9 vs -j9, without success.

https://cirrus-ci.com/task/5580584675180544

https://github.com/justinpryzby/postgres/commit/a865adc5b8c
fc7b3ea8bce vcregress/ci: test modules/contrib with NO_INSTALLCHECK=1
03adb043d16 wip: vcsregress: add alltaptests
63bf0796ffd wip: vcregress: run alltaptests in parallel
9dc327f6b30 f!wip: vcregress: run alltaptests in a single prove invocation
a865adc5b8c tmp: run tap tests first

> > It currently fails in 027_stream_regress.pl, although I keep hoping that it
> > had been fixed...
> 
> That's likely because you're not setting REGRESS_OUTPUTDIR like
> src/test/recovery/Makefile and recoverycheck() are doing.

Yes, thanks.

-- 
Justin




Re: buildfarm warnings

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-12 16:42:03 -0500, Tom Lane wrote:
> Another new one that maybe should be silenced is
>
> /mnt/resource/bf/build/skink-master/HEAD/pgsql.build/../pgsql/src/backend/access/heap/vacuumlazy.c:1129:41:
>  warning: 'freespace' may be used uninitialized in this function 
> [-Wmaybe-uninitialized]
>
> Only skink and frogfish are showing that, though.

skink uses -O1 to reduce the overhead of valgrind a bit (iirc that shaved off
a couple of hours) without making backtraces completely unreadable. Which
explains why it shows different warnings than most other animals...

Greetings,

Andres Freund




Re: Race condition in TransactionIdIsInProgress

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-12 13:25:58 +, Simon Riggs wrote:
> On Fri, 11 Feb 2022 at 19:08, Andres Freund  wrote:
> 
> > On 2022-02-11 13:48:59 +, Simon Riggs wrote:
> > > On Fri, 11 Feb 2022 at 08:48, Simon Riggs  
> > > wrote:
> > > >
> > > > On Fri, 11 Feb 2022 at 06:11, Andres Freund  wrote:
> > > >
> > > > > Looks lik syncrep will make this a lot worse, because it can 
> > > > > drastically
> > > > > increase the window between the TransactionIdCommitTree() and
> > > > > ProcArrayEndTransaction() due to the SyncRepWaitForLSN() inbetween.  
> > > > > But at
> > > > > least it might make it easier to write tests exercising this 
> > > > > scenario...
> > > >
> > > > Agreed
> > > >
> > > > TransactionIdIsKnownCompleted(xid) is only broken because the single
> > > > item cache is set too early in some cases. The single item cache is
> > > > important for performance, so we just need to be more careful about
> > > > setting the cache.
> > >
> > > Something like this... fix_cachedFetchXid.v1.patch prevents the cache
> > > being set, but this fails! Not worked out why, yet.
> >
> > I don't think it's safe to check !TransactionIdKnownNotInProgress() in
> > TransactionLogFetch(), given that TransactionIdKnownNotInProgress() needs to
> > do TransactionLogFetch() internally.
> 
> That's not correct because you're confusing
> TransactionIdKnownNotInProgress(), which is a new function introduced
> by the patch, with the existing function
> TransactionIdIsKnownCompleted().

I don't think so. This call of the new TransactionIdKnownNotInProgress()

> --- a/src/backend/access/transam/transam.c
> +++ b/src/backend/access/transam/transam.c
> @@ -73,6 +73,14 @@ TransactionLogFetch(TransactionId transactionId)
>   return TRANSACTION_STATUS_ABORTED;
>   }
>  
> + /*
> +	 * Safeguard that we have called TransactionIdIsInProgress() before
> +  * checking commit log manager, to ensure that we do not cache the
> +  * result until the xid is no longer in the procarray at eoxact.
> +  */
> + if (!TransactionIdKnownNotInProgress(transactionId))
> + return TRANSACTION_STATUS_IN_PROGRESS;
> +
>   /*
>* Get the transaction status.
>*/

isn't right. Consider what happens to TransactionIdIsInProgress(x)'s call to
TransactionIdDidAbort(x) at "step 4". TransactionIdDidAbort(x) will call
TransactionLogFetch(x). cachedXidIsNotInProgress isn't yet set to x, so
TransactionLogFetch() returns TRANSACTION_STATUS_IN_PROGRESS. Even if the
(sub)transaction aborted.

Which explains why your first patch doesn't work.


> > > just_remove_TransactionIdIsKnownCompleted_call.v1.patch
> >
> > I think this might contain a different diff than what the name suggests?
> 
> Not at all, please don't be confused by the name. The patch removes
> the call to TransactionIdIsKnownCompleted() from
> TransactionIdIsInProgress().

I guess for me "just remove" doesn't include adding a new cache...


> I'm not sure it is possible to remove TransactionIdIsKnownCompleted()
> in backbranches.

I think it'd be fine if we needed to. Or we could just make it always return
false or such.


> > > just removes the known offending call, passes make check, but IMHO
> > > leaves the same error just as likely by other callers.
> >
> > Which other callers are you referring to?
> 
> I don't know of any, but we can't just remove a function in a
> backbranch, can we?

We have in the past, if it's a "sufficiently internal" function.


Greetings,

Andres Freund




Re: buildfarm warnings

2022-02-12 Thread Tom Lane
Justin Pryzby  writes:
> Is there any check for warnings from new code, other than those buildfarm
> members with -Werror ?

I periodically grep the buildfarm logs for interesting warnings.
There are a lot of uninteresting ones :-(, so I'm not sure how
automatable that'd be.  I do use a prefiltering script, which
right now says there are
13917 total warnings from 41 buildfarm members
of which it thinks all but 350 are of no interest.  Most of those
350 are of no interest either, but I didn't bother to make
filter rules for them yet ...
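
(Not the actual script, but the shape of such a prefilter is simple enough;
noise.regex would hold one pattern per known-uninteresting warning:)

    # keep only warnings matching no known-noise pattern
    grep -hE ': warning' stage_logs/*.log | sort -u | grep -vEf noise.regex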

> It'd be better to avoid warnings, allowing members to use -Werror, rather than
> to allow/ignore warnings, which preclude that possibility.

The ones I classify as "uninteresting" are mostly from old/broken
compilers.  I don't think it'd be profitable to try to convince
prairiedog's 17-year-old compiler that we don't have uninitialized
variables, for instance.

> I'm of the impression that some people have sql access to BF logs.

Yeah.  I'm not sure exactly what the access policy is for that machine;
I know we'll give out logins to committers, but not whether there's
any precedent for non-committers.

> pg_basebackup.c:1261:35: warning: storing the address of local variable 
> archive_filename in progress_filename [-Wdangling-pointer=]
>   => new in 23a1c6578 - looks like a real error

I saw that one a few days ago but didn't get around to looking
more closely yet.  It does look fishy, but it might be okay
depending on when the global variable can be accessed.

Another new one that maybe should be silenced is

/mnt/resource/bf/build/skink-master/HEAD/pgsql.build/../pgsql/src/backend/access/heap/vacuumlazy.c:1129:41:
 warning: 'freespace' may be used uninitialized in this function 
[-Wmaybe-uninitialized]

Only skink and frogfish are showing that, though.

regards, tom lane




Re: refactoring basebackup.c

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-12 15:12:21 -0600, Justin Pryzby wrote:
> I think they would've been visible in the CI environment, too.

Yea, but only if you looked carefully enough. The postgres github repo has CI
enabled, and it's green. But the windows build step does show the warnings:

https://cirrus-ci.com/task/6185407539838976?logs=build#L2066
https://cirrus-ci.com/github/postgres/postgres/

[19:08:09.086] c:\cirrus\src\backend\replication\basebackup_lz4.c(87): warning 
C4715: 'bbsink_lz4_new': not all control paths return a value 
[c:\cirrus\postgres.vcxproj]

Probably worth scripting something to make the windows task error out if there
had been warnings, but only after running the tests.

Greetings,

Andres Freund




buildfarm warnings

2022-02-12 Thread Justin Pryzby
Is there any check for warnings from new code, other than those buildfarm
members with -Werror ?

It'd be better to avoid warnings, allowing members to use -Werror, rather than
to allow/ignore warnings, which precludes that possibility.  A circular problem.

I checked for warnings on master during "check" phase by opening many browser
tabs...

I'm of the impression that some people have sql access to BF logs.  It seems
like it'd be easy to make a list of known/old/false-positive warnings and
ignore warnings matching (member,file,regex) to detect new warnings.

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=caiman&dt=2022-02-12%2006%3A00%3A30&stg=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=moonjelly&dt=2022-02-12%2001%3A06%3A22&stg=make
rawhide prerelease / gcc trunk
pg_basebackup.c:1261:35: warning: storing the address of local variable 
archive_filename in progress_filename [-Wdangling-pointer=]
=> new in 23a1c6578 - looks like a real error

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=wrasse&dt=2022-02-12%2005%3A08%3A48&stg=make
Oracle Developer Studio
"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/libpq/auth.c",
 line 104: warning: initialization type mismatch
"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/commands/copy.c",
 line 206: warning: true part of ?: is unused and is nonconstant
"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/utils/fmgr/fmgr.c",
 line 400: warning: assignment type mismatch:
"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/access/common/reloptions.c",
 line 742: warning: argument #2 is incompatible with prototype:
=> "Size relopt_struct_size" looks wrong since 911e7020 (v13)

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=haddock&dt=2022-02-12%2000%3A23%3A17&stg=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hake&dt=2022-02-12%2003%3A00%3A35&stg=make
illumos "rolling release"
pg_dump_sort.c:1232:3: warning: format not a string literal and no format 
arguments [-Wformat-security]

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=prairiedog&dt=2022-02-12%2009%3A52%3A07&stg=make
OSX/PPC
A lot of these, but some from new code (ba59)
pg_receivewal.c:288: warning: 'wal_compression_method' may be used 
uninitialized in this function
pg_receivewal.c:289: warning: 'ispartial' may be used uninitialized in this 
function

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=gharial&dt=2022-02-12%2012%3A36%3A42&stg=make
HPUX
compress_io.c:532:4: warning: implicit declaration of function gzopen64 
[-Wimplicit-function-declaration]
pg_backup_archiver.c:1521:4: warning: implicit declaration of function gzopen64 
[-Wimplicit-function-declaration]
pg_backup_archiver.c:1521:11: warning: assignment makes pointer from integer 
without a cast [enabled by default]

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=anole&dt=2022-02-12%2005%3A46%3A44&stg=make
HPUX
"heapam_handler.c", line 2434: warning #2940-D: missing return statement at end 
of non-void function "heapam_scan_sample_next_tuple"
"regc_lex.c", line 762: warning #2940-D: missing return statement at end of 
non-void function "lexescape"
"fmgr.c", line 400: warning #2513-D: a value of type "void *" cannot be 
assigned to an entity of type "PGFunction"
"bbstreamer_gzip.c", line 92: warning #2223-D: function "gzopen64" declared 
implicitly

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=tern&dt=2022-02-12%2006%3A48%3A14&stg=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=sungazer&dt=2022-02-12%2009%3A26%3A58&stg=make
ppc
auth.c:1876:6: warning: implicit declaration of function 'getpeereid'; did you 
mean 'getpgid'? [-Wimplicit-function-declaration]
missing #include ?

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=pollock&dt=2022-02-12%2004%3A35%3A53&stg=make
GCC10/illumos
pg_shmem.c:326:33: warning: passing argument 1 of 'shmdt' from incompatible 
pointer type [-Wincompatible-pointer-types]

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=pogona&dt=2022-02-12%2004%3A10%3A03&stg=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=desmoxytes&dt=2022-02-12%2018%3A24%3A03&stg=make
LLVM: warning: void* memcpy(void*, const void*, size_t) writing to an object of 
type struct std::pair with no trivial 
copy-assignment; use copy-assignment or copy-initialization instead 
[-Wclass-memaccess]

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=rorqual&dt=2022-02-12%2018%3A23%3A39&stg=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=serinus&dt=2022-02-12%2005%3A00%3A08&stg=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=calliphoridae&dt=2022-02-12%2004%3A07%3A11&stg=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=culicidae&dt=2022-02-12%2004%3A07%3A13&stg=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=piculet&dt=2022-02-12%2004%3A07%3A25&stg=make

Re: refactoring basebackup.c

2022-02-12 Thread Justin Pryzby
The LZ4 patches caused new compiler warnings.
It's the same issue that was fixed at 71ce8 for gzip.
I think they would've been visible in the CI environment, too.

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=wrasse&dt=2022-02-12%2005%3A08%3A48&stg=make
"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/replication/basebackup_lz4.c",
 line 87: warning: Function has no return statement : bbsink_lz4_new

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=bowerbird&dt=2022-02-12%2013%3A11%3A20&stg=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hamerkop&dt=2022-02-12%2010%3A04%3A08&stg=make
warning C4715: 'bbsink_lz4_new': not all control paths return a value

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=anole&dt=2022-02-12%2005%3A46%3A44&stg=make
"basebackup_lz4.c", line 87: warning #2940-D: missing return statement at end 
of non-void function "bbsink_lz4_new"

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=wrasse&dt=2022-02-12%2005%3A08%3A48&stg=make
"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/replication/basebackup_lz4.c",
 line 87: warning: Function has no return statement : bbsink_lz4_new
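
The usual idiom in the tree for this class of warning is to follow the
no-return ereport(ERROR) with a dummy return, something like this (a sketch
of the shape of the fix, not the committed change):

    bbsink *
    bbsink_lz4_new(bbsink *next, int compresslevel)
    {
    #ifndef USE_LZ4
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("lz4 compression is not supported by this build")));
        return NULL;                /* keep compiler quiet */
    #else
        /* ... the real constructor, which ends in its own return ... */
    #endif
    }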




Re: Use generation context to speed up tuplesorts

2022-02-12 Thread Tomas Vondra
On 1/20/22 08:36, Ronan Dunklau wrote:
> Le jeudi 20 janvier 2022, 02:31:15 CET David Rowley a écrit :
>> On Tue, 18 Jan 2022 at 19:45, Julien Rouhaud  wrote:
>>> I'm also a bit confused about which patch(es) should actually be reviewed
>>> in that thread.  My understanding is that the original patch might not be
>>> relevant anymore but I might be wrong.  The CF entry should probably be
>>> updated to clarify that, with an annotation/ title change / author update
>>> or something.
>>>
>>> In the meantime I will switch the entry to Waiting on Author.
>>
>> I think what's going on here is a bit confusing. There's quite a few
>> patches on here that aim to do very different things.
>>
>> The patch I originally proposed was just to make tuplesort use
>> generation memory contexts. This helps speed up tuplesort because
>> generation contexts allow palloc() to be faster than the standard
>> allocator. Additionally, there's no power-of-2 wastage with generation
>> contexts like there is with the standard allocator. This helps fit
>> more tuples on fewer CPU cache lines.
>>
>> I believe Andres then mentioned that the fixed block size for
>> generation contexts might become a problem. With the standard
>> allocator the actual size of the underlying malloc() grows up until a
>> maximum size.  With generation contexts this is always the same, so we
>> could end up with a very large number of blocks which means lots of
>> small mallocs which could end up slow.  Tomas then posted a series of
>> patches to address this.
>>
>> I then posted another patch that has the planner make a choice on the
>> size of the blocks to use for the generation context based on the
>> tuple width and number of tuple estimates. My hope there was to
>> improve performance further by making a better choice for how large to
>> make the blocks in the first place.  This reduces the need for Tomas'
>> patch, but does not eliminate the need for it.
>>
>> As of now, I still believe we'll need Tomas' patches to allow the
>> block size to grow up to a maximum size.  I think those patches are
>> likely needed before we think about making tuplesort use generation
>> contexts.  The reason I believe this is that we don't always have good
>> estimates for the number of tuples we're going to sort.  nodeSort.c is
>> fairly easy as it just fetches tuples once and then sorts them.  The
>> use of tuplesort for nodeIncrementalSort.c is much more complex as
>> we'll likely be performing a series of smaller sorts which are much
>> harder to get size estimates for. This means there's likely no magic
>> block size that we can use for the generation context that's used to
>> store the tuples in tuplesort.  This is also the case for order by
>> aggregates and probably most other usages of tuplesort.
>>
> 
> You left out the discussion I started about glibc's malloc specific behaviour.
> 
> I tried to demonstrate that a big part of the performance gain you were 
> seeing 
> were not in fact due to our allocator by itself, but by the way different 
> block 
> sizes allocations interact with this malloc implementation and how it handles 
> releasing memory back to the system. I also tried to show that we can give 
> DBAs more control over how much memory to "waste" as the price for faster 
> allocations.
> 
> I agree that the generation context memory manager is useful in the tuplesort 
> context, if only for the fact that we fall back to disk less quickly due to 
> the reduced wastage of memory, but I would be wary of drawing conclusions too 
> quickly based on a specific malloc implementation which shows threshold 
> effects, 
> which are compounded by our growing block sizes code in general.
> 

Yeah, I share this concern - how much of the improvement is real and how
much is due to accidentally not hitting some glibc-specific self-tuning
stuff? Would realistic workloads get the same improvement, or would they
regress? What about non-glibc systems with other malloc implementations?

I'm not against pushing the generation context improvements, e.g.
something like the patches posted in [1], because those seem reasonable
in general. But I'm somewhat confused about the last patch (adjusting
allocChunkLimit) causing fairly significant regressions.
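
For anyone who hasn't looked at the allocator under discussion: a generation
context packs chunks densely into blocks (no power-of-2 round-up, unlike
aset.c) and frees whole blocks once every chunk on them is dead. A minimal
usage sketch, using the three-argument GenerationContextCreate() the tree
had at the time of this thread (the signature is expected to change if the
growing-block-size patches go in):

    #include "postgres.h"
    #include "utils/memutils.h"

    static void
    generation_demo(void)
    {
        /* single fixed block size -- the point of contention above */
        MemoryContext gen = GenerationContextCreate(CurrentMemoryContext,
                                                    "gen demo",
                                                    1024 * 1024);
        MemoryContext old = MemoryContextSwitchTo(gen);
        char       *tup = palloc(50);   /* ~50 bytes + header, not rounded up to 64 */

        (void) tup;
        MemoryContextSwitchTo(old);
        MemoryContextDelete(gen);       /* releases all blocks at once */
    }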

I'll try investigating this a bit more next week.


regards


[1]
https://www.postgresql.org/message-id/flat/bcdd4e3e-c12d-cd2b-7ead-a91ad416100a%40enterprisedb.com#a5ecd4e5a9e7d5b07ff25b5684da894f

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Use generation context to speed up tuplesorts

2022-02-12 Thread Tomas Vondra
On 1/25/22 13:34, Ronan Dunklau wrote:
>
> ...
>
> I've run the same 1-to-32-columns sort benchmark, using both glibc malloc and 
> jemalloc, with the standard aset memory manager or with David's generation v2 
> patch.  To use jemalloc, I just use a LD_PRELOAD env variable before starting 
> postgres. 
> 
> I have not tried to measure jemalloc's memory footprint like I did for 
> glibc's 
> malloc in the previous benchmark, as I'm not trying to evaluate jemalloc as a 
> glibc's malloc replacement. 
> 
> Please find the results attached. As I suspected, most of the gain observed 
> with David's patch come from working around glibc's malloc idiosyncracies but 
> the good news is that it also helps (to a lesser degree) with jemalloc.
> 
> I'm not sure how that would behave with growing block sizes though: as Tomas 
> patch and stress-testing benchmarks showed, we can hit some particular 
> thresholds (and regressions) in glibc's self-tuning behaviour.
> 

Interesting results!

So just switching to jemalloc can yield much better performance (in some
cases), even beating the generation context (on top of glibc malloc).

Of course, those are just simple (somewhat synthetic) benchmarks, and
there is no info about memory consumption. Perhaps there are cases where
this would be slower than glibc malloc, and I doubt many people would
switch to a non-default malloc implementation.
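
(For anyone reproducing this, the LD_PRELOAD trick is simply the following,
with the library path being whatever your distro ships:)

    LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 \
        pg_ctl -D "$PGDATA" -l logfile start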

But I think in general this mostly confirms our suspicion about the
patch (somewhat accidentally) working around glibc internal behavior.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: PGroonga index-only scan problem with yesterday’s PostgreSQL updates

2022-02-12 Thread Anders Kaseorg

On 2/12/22 09:25, Tom Lane wrote:

I don't know that we'd go that far ... maybe if another bad problem
turns up.  In the meantime, though, I do have a modest suggestion:
it would not be too hard to put some defenses in place against another
such bug.  The faulty commit was in our tree for a month and nobody
reported a problem, which is annoying.  Do you want to think about doing
your testing against git branch tips, rather than the last released
versions?  Making a new build every few days would probably be plenty
fast enough.

An even slicker answer would be to set up a PG buildfarm machine
that, in addition to the basic tests, builds PGroonga against the
new PG sources and runs your tests.  Andrew's machine "crake" does
that for RedisFDW and BlackholeFDW, and the source code for at least
the latter module is in the buildfarm client distribution.


I’m a Zulip developer, not a PGroonga developer, but this sounds like a 
good idea to me, so I opened a PR to add the PostgreSQL 15 snapshot 
build to PGroonga’s test matrix.


https://github.com/pgroonga/pgroonga/pull/204

The current snapshot build in the PGDG APT repository is 
15~~devel~20220208.0530-1~541.gitba15f16, predating this fix, despite 
the documentation that they should be built every 6 hours 
(https://wiki.postgresql.org/wiki/Apt/FAQ#Development_snapshots).  But 
it seems the tests have other failures against this snapshot build that 
will need to be investigated.


Anders




[PATCH] Add tests for psql tab completion

2022-02-12 Thread Matheus Alcantara
Hi hackers.

I'm attaching a patch that add some new test cases for tab completion of psql.

This is my first patch that I'm sending here so let me know if I'm doing 
something wrong.
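
To run just this file, something like the following should work (as far as
I can tell the test skips itself when psql isn't built with readline or
IO::Pty isn't available):

    make -C src/bin/psql check PROVE_TESTS='t/010_tab_completion.pl'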

--
Matheus Alcantara

From 6af6b972960f4e9017f1c311ee4b13a96d5f66a1 Mon Sep 17 00:00:00 2001
From: Matheus Alcantara 
Date: Sat, 12 Feb 2022 16:45:55 -0300
Subject: [PATCH] psql: Add tests for tab completion

---
 src/bin/psql/t/010_tab_completion.pl | 92 
 1 file changed, 92 insertions(+)

diff --git a/src/bin/psql/t/010_tab_completion.pl b/src/bin/psql/t/010_tab_completion.pl
index 005961f34d..f74ff96226 100644
--- a/src/bin/psql/t/010_tab_completion.pl
+++ b/src/bin/psql/t/010_tab_completion.pl
@@ -45,6 +45,7 @@ $node->safe_psql('postgres',
 	  . "CREATE TABLE mytab246 (f1 int, f2 text);\n"
 	  . "CREATE TABLE \"mixedName\" (f1 int, f2 text);\n"
 	  . "CREATE TYPE enum1 AS ENUM ('foo', 'bar', 'baz', 'BLACK');\n"
+	  . "CREATE INDEX idx_tab1_c1 ON tab1(c1);\n"
 	  . "CREATE PUBLICATION some_publication;\n");
 
 # Developers would not appreciate this test adding a bunch of junk to
@@ -382,6 +383,97 @@ check_completion(
 
 clear_query();
 
+check_completion(
+	"CREATE OR REPLACE \t\t",
+	qr/AGGREGATE +LANGUAGE +RULE +TRIGGER +\r\nFUNCTION +PROCEDURE +TRANSFORM +VIEW/,
+	"check create or replace");
+
+clear_query();
+
+check_completion(
+	"ALTER FOREIGN \t\t",
+	qr/DATA WRAPPER +TABLE/ ,
+	"check alter foreign");
+
+clear_query();
+
+check_completion(
+	"ALTER INDEX \t\t",
+	qr/ALL IN TABLESPACE +information_schema. +tab1_pkey\r\nidx_tab1_c1/,
+	"check alter index");
+
+clear_query();
+
+check_completion(
+	"ALTER INDEX idx_tab1_c1 \t\t",
+	qr/ALTER COLUMN +NO DEPENDS ON EXTENSION +RESET\r\nATTACH PARTITION +OWNER TO +SET\r\nDEPENDS ON EXTENSION +RENAME TO/,
+	"check alter index ");
+
+clear_query();
+
+check_completion(
+	"ALTER INDEX idx_tab1_c1 ATTACH \t\t",
+	qr/PARTITION/,
+	"check alter index  attach");
+
+clear_query();
+
+check_completion(
+	"ALTER INDEX idx_tab1_c1 ATTACH PARTITION \t\t",
+	qr/idx_tab1_c1 +public. +\r\ninformation_schema. +tab1_pkey/,
+	"check alter index  ATTACH PARTITION");
+
+clear_query();
+
+check_completion(
+	"ALTER INDEX idx_tab1_c1 ALTER \t\t",
+	qr/COLUMN/,
+	"check alter index  alter");
+
+clear_query();
+
+check_completion(
+	"ALTER INDEX idx_tab1_c1 ALTER COLUMN \t\t",
+	qr/1 +SET +STATISTICS/,
+	"check alter index  alter column");
+
+clear_query();
+
+check_completion(
+	"ALTER INDEX idx_tab1_c1 ALTER COLUMN 1 SET \t\t",
+	qr/STATISTICS/,
+	"check alter index  alter column  set");
+
+clear_query();
+
+check_completion(
+	"ALTER INDEX idx_tab1_c1 SET \t\t",
+	qr/\( +TABLESPACE/,
+	"check alter index  set");
+
+clear_query();
+
+check_completion(
+	"ALTER INDEX idx_tab1_c1 RESET \t\t",
+	qr/\( +\a/,
+	"check alter index  reset");
+
+clear_query();
+
+check_completion(
+	"ALTER INDEX idx_tab1_c1 NO DEPENDS \t\t",
+	qr/ON EXTENSION/,
+	"check alter index  no depends");
+
+clear_query();
+
+check_completion(
+	"ALTER INDEX idx_tab1_c1 DEPENDS \t\t",
+	qr/ON EXTENSION/,
+	"check alter index  depends");
+
+clear_query();
+
 # check VersionedQuery infrastructure
 check_completion(
 	"DROP PUBLIC\t \t\t",
-- 
2.35.1



Re: pgsql: Add TAP test to automate the equivalent of check_guc

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-12 13:29:54 +0900, Michael Paquier wrote:
> On Sat, Feb 12, 2022 at 09:49:47AM +0900, Michael Paquier wrote:
> > I guess that it would be better to revert the TAP test and rework all
> > that for the time being, then.
> 
> And this part is done.

You wrote in the commit message that:

"However, this is also an issue as a TAP test only offers the build directory
as of TESTDIR in the environment context, so this would fail with VPATH
builds."

TAP tests are currently always executed in the source directory above t/, as
far as I can tell. So I don't think VPATH/non-VPATH or windows / non-windows
would break using a relative reference from the test dir.

Greetings,

Andres Freund




Re: logical decoding and replication of sequences

2022-02-12 Thread Tomas Vondra
On 2/12/22 01:34, Tomas Vondra wrote:
> On 2/10/22 19:17, Tomas Vondra wrote:
>> I've polished & pushed the first part adding sequence decoding
>> infrastructure etc. Attached are the two remaining parts.
>>
>> I plan to wait a day or two and then push the test_decoding part. The
>> last part (for built-in replication) will need more work and maybe
>> rethinking the grammar etc.
>>
> 
> I've pushed the second part, adding sequences to test_decoding.
> 
> Here's the remaining part, rebased, with a small tweak in the TAP test
> to eliminate the issue with not waiting for sequence increments. I've
> kept the tweak in a separate patch, so that we can throw it away easily
> if we happen to resolve the issue.
> 

Hmm, cfbot was not happy about this, so here's a version fixing the
elog() format issue reported by CirrusCI/mingw by ditching the log
message. It was useful for debugging, but otherwise just noise.

I'm a bit puzzled about the macOS failure, though. It seems as if the
test does not wait for the subscriber long enough, but this is with the
tweaked test variant, so it should not have the rollback issue. And I
haven't seen this failure on any other machine.

Regarding adding decoding of sequences to the built-in replication,
there is a couple questions that we need to discuss first before
cleaning up the code etc. Most of them are related to syntax and
handling of various sequence variants.


1) Firstly, what about implicit sequences. That is, if you create a
table with SERIAL or BIGSERIAL column, that'll have a sequence attached.
Should those sequences be added to the publication when the table gets
added? Or should we require adding them separately? Or should that be
specified in the grammar, somehow? Should we have INCLUDING SEQUENCES
for ALTER PUBLICATION ... ADD TABLE ...?

I think we shouldn't require replicating the sequence, because who knows
what the schema is on the subscriber? We want to allow differences, so
maybe the sequence is not there. I'd start with just adding them
separately, because that just seems simpler, but maybe there are good
reasons to support adding them in ADD TABLE.


2) Should it be possible to add sequences that are also associated with
a serial column, without the table being replicated too? I'd say yes, if
people want to do that - I don't think it can cause any issues, and it's
possible to just use sequence directly for non-serial columns anyway.
Which is the same thing, but we can't detect that.


3) What about sequences for UNLOGGED tables? At the moment we don't
allow sequences to be UNLOGGED (Peter revived his patch [1], but that's
not committed yet). Again, I'd say it's up to the user to decide which
sequences are replicated - it's similar to (2).


4) I wonder if we actually want FOR ALL SEQUENCES. On the one hand it'd
be symmetrical with FOR ALL TABLES, which is the other object type we
can replicate. So it'd seem reasonable to handle them in a similar way.
But it's causing some shift/reduce error in the grammar, so it'll need
some changes.
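
To make the grammar questions concrete, the spellings being weighed look
roughly like this (all proposed syntax, nothing committed):

    -- (1)/(2): adding a sequence explicitly, independent of its table
    ALTER PUBLICATION mypub ADD SEQUENCE myseq;

    -- (1): optionally pulling in a table's implicit sequences
    ALTER PUBLICATION mypub ADD TABLE mytab INCLUDING SEQUENCES;

    -- (4): the catch-all, symmetric with FOR ALL TABLES
    CREATE PUBLICATION mypub FOR ALL SEQUENCES;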



regards


[1]
https://www.postgresql.org/message-id/8da92c1f-9117-41bc-731b-ce1477a77...@enterprisedb.com

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From 960c90c6849e27eabe09dd3582b6f844c5d5a6a5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra 
Date: Thu, 10 Feb 2022 15:18:59 +0100
Subject: [PATCH 1/2] Add support for decoding sequences to built-in
 replication

---
 doc/src/sgml/catalogs.sgml  |  71 
 doc/src/sgml/ref/alter_publication.sgml |  24 +-
 doc/src/sgml/ref/alter_subscription.sgml|   4 +-
 src/backend/catalog/pg_publication.c| 149 -
 src/backend/catalog/system_views.sql|  10 +
 src/backend/commands/publicationcmds.c  | 350 +++-
 src/backend/commands/sequence.c |  79 +
 src/backend/commands/subscriptioncmds.c | 272 +++
 src/backend/executor/execReplication.c  |   2 +-
 src/backend/nodes/copyfuncs.c   |   1 +
 src/backend/nodes/equalfuncs.c  |   1 +
 src/backend/parser/gram.y   |  32 ++
 src/backend/replication/logical/proto.c |  52 +++
 src/backend/replication/logical/tablesync.c | 114 ++-
 src/backend/replication/logical/worker.c|  60 
 src/backend/replication/pgoutput/pgoutput.c |  85 -
 src/backend/utils/cache/relcache.c  |   4 +-
 src/bin/psql/tab-complete.c |  14 +-
 src/include/catalog/pg_proc.dat |   5 +
 src/include/catalog/pg_publication.h|  14 +
 src/include/commands/sequence.h |   1 +
 src/include/nodes/parsenodes.h  |   6 +
 src/include/replication/logicalproto.h  |  19 ++
 src/include/replication/pgoutput.h  |   1 +
 src/test/regress/expected/rules.out |   8 +
 src/test/subscription/t/028_sequences.pl| 196 +++
 26 files changed, 1538 insertions(+), 36 deletions(-)
 create mode 100644 

Re: pgsql: Add TAP test to automate the equivalent of check_guc

2022-02-12 Thread Christoph Berg
Re: Tom Lane
> Ah.  That seems a bit problematic anyway ... if the version-dependent
> pg_configs don't all return identical results, what's the
> non-version-dependent one supposed to do?

It still returns a correct --includedir for client apps that need it.

(Unfortunately many need --includedir-server since that's where the
type OIDs live.)

Christoph




Re: Re: PGroonga index-only scan problem with yesterday’s PostgreSQL updates

2022-02-12 Thread Tom Lane
Anders Kaseorg  writes:
> On 2/11/22 15:57, Tom Lane wrote:
>> Possibly same issue I just fixed?
>> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e5691cc9170bcd6c684715c2755d919c5a16fea2

> Yeah, that does seem to fix my test cases.  Thanks!

> Any chance this will make it into minor releases sooner than three 
> months from now?  I know extra minor releases are unusual, but this 
> regression will be critical at least for self-hosted Zulip users and 
> presumably other PGroonga users.

I don't know that we'd go that far ... maybe if another bad problem
turns up.  In the meantime, though, I do have a modest suggestion:
it would not be too hard to put some defenses in place against another
such bug.  The faulty commit was in our tree for a month and nobody
reported a problem, which is annoying.  Do you want to think about doing
your testing against git branch tips, rather than the last released
versions?  Making a new build every few days would probably be plenty
fast enough.

An even slicker answer would be to set up a PG buildfarm machine
that, in addition to the basic tests, builds PGroonga against the
new PG sources and runs your tests.  Andrew's machine "crake" does
that for RedisFDW and BlackholeFDW, and the source code for at least
the latter module is in the buildfarm client distribution.

regards, tom lane




Re: pgsql: Add TAP test to automate the equivalent of check_guc

2022-02-12 Thread Tom Lane
Christoph Berg  writes:
> I was confusing that with this: The problem that led to the pg_config
> patch years ago was that we have a /usr/bin/pg_config in
> (non-major-version-dependant) libpq-dev, and
> /usr/lib/postgresql/NN/bin/pg_config in the individual
> postgresql-server-dev-NN packages, and iirc the /usr/bin version
> didn't particularly like the other binaries being in
> /usr/lib/postgresql/NN/bin/.

Ah.  That seems a bit problematic anyway ... if the version-dependent
pg_configs don't all return identical results, what's the
non-version-dependent one supposed to do?

regards, tom lane




Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

2022-02-12 Thread Tom Lane
Alexander Lakhin  writes:
> 11.02.2022 05:22, Andres Freund wrote:
>> Over in another thread I made some wild unsubstantiated guesses that the
>> windows issues could have been made much more likely by a somewhat odd bit of
>> code in PQisBusy():
>> https://postgr.es/m/1959196.1644544971%40sss.pgh.pa.us
>> Alexander, any chance you'd try if that changes the likelihood of the problem
>> occurring, without any other fixes / reverts applied?

> Unfortunately I haven't seen an improvement for the test in question.

Yeah, that's what I expected, sadly.  While I think this PQisBusy behavior
is definitely a bug, it will not lead to an infinite loop, just to write
failures being reported in a less convenient fashion than intended.

I wonder whether it would help to put a PQconsumeInput call *before*
the PQisBusy loop, so that any pre-existing EOF condition will be
detected.  If you don't like duplicating code, we could restructure
the loop as

    for (;;)
    {
        int         rc;

        /* Consume whatever data is available from the socket */
        if (PQconsumeInput(streamConn) == 0)
        {
            /* trouble; return NULL */
            return NULL;
        }

        /* Done? */
        if (!PQisBusy(streamConn))
            break;

        /* Wait for more data */
        rc = WaitLatchOrSocket(MyLatch,
                               WL_EXIT_ON_PM_DEATH | WL_SOCKET_READABLE |
                               WL_LATCH_SET,
                               PQsocket(streamConn),
                               0,
                               WAIT_EVENT_LIBPQWALRECEIVER_RECEIVE);

        /* Interrupted? */
        if (rc & WL_LATCH_SET)
        {
            ResetLatch(MyLatch);
            ProcessWalRcvInterrupts();
        }
    }

    /* Now we can collect and return the next PGresult */
    return PQgetResult(streamConn);


In combination with the PQisBusy fix, this might actually help ...

regards, tom lane




Re: pgsql: Add TAP test to automate the equivalent of check_guc

2022-02-12 Thread Christoph Berg
Re: Tom Lane
> Christoph Berg  writes:
> > I think the "bug" here is that vanilla PG doesn't support installing
> > in FHS locations with /usr/lib and /usr/share split, hence the Debian
> > patch.
> 
> I'm confused by this statement.  An out-of-the-box installation
> produces trees under $PREFIX/lib and $PREFIX/share.  What about
> that is not FHS compliant?  And couldn't you fix it using the
> installation fine-tuning switches that configure already provides?

To support multiple major versions in parallel, we need
/usr/lib/postgresql/NN/lib and /usr/share/postgresql/NN, so it's not a
single $PREFIX. But you are correct, and ./configure supports that.
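
(Roughly the kind of fine-tuning meant here, with Debian-style paths;
illustrative, not our actual packaging rules:)

    ./configure --prefix=/usr/lib/postgresql/15 \
                --datadir=/usr/share/postgresql/15 \
                --docdir=/usr/share/doc/postgresql-15 \
                --sysconfdir=/etc/postgresql-common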

I was confusing that with this: The problem that led to the pg_config
patch years ago was that we have a /usr/bin/pg_config in
(non-major-version-dependant) libpq-dev, and
/usr/lib/postgresql/NN/bin/pg_config in the individual
postgresql-server-dev-NN packages, and iirc the /usr/bin version
didn't particularly like the other binaries being in
/usr/lib/postgresql/NN/bin/.

I guess it's time to revisit that problem now and see if it can be
solved more pretty today on our side.

Sorry for the rambling,
Christoph




Re: pgsql: Add TAP test to automate the equivalent of check_guc

2022-02-12 Thread Tom Lane
Christoph Berg  writes:
> I think the "bug" here is that vanilla PG doesn't support installing
> in FHS locations with /usr/lib and /usr/share split, hence the Debian
> patch.

I'm confused by this statement.  An out-of-the-box installation
produces trees under $PREFIX/lib and $PREFIX/share.  What about
that is not FHS compliant?  And couldn't you fix it using the
installation fine-tuning switches that configure already provides?

regards, tom lane




Re: Race condition in TransactionIdIsInProgress

2022-02-12 Thread Simon Riggs
On Fri, 11 Feb 2022 at 19:08, Andres Freund  wrote:

> On 2022-02-11 13:48:59 +, Simon Riggs wrote:
> > On Fri, 11 Feb 2022 at 08:48, Simon Riggs  
> > wrote:
> > >
> > > On Fri, 11 Feb 2022 at 06:11, Andres Freund  wrote:
> > >
> > > > Looks like syncrep will make this a lot worse, because it can drastically
> > > > increase the window between the TransactionIdCommitTree() and
> > > > ProcArrayEndTransaction() due to the SyncRepWaitForLSN() inbetween.  
> > > > But at
> > > > least it might make it easier to write tests exercising this scenario...
> > >
> > > Agreed
> > >
> > > TransactionIdIsKnownCompleted(xid) is only broken because the single
> > > item cache is set too early in some cases. The single item cache is
> > > important for performance, so we just need to be more careful about
> > > setting the cache.
> >
> > Something like this... fix_cachedFetchXid.v1.patch prevents the cache
> > being set, but this fails! Not worked out why, yet.
>
> I don't think it's safe to check !TransactionIdKnownNotInProgress() in
> TransactionLogFetch(), given that TransactionIdKnownNotInProgress() needs to
> do TransactionLogFetch() internally.

That's not correct because you're confusing
TransactionIdKnownNotInProgress(), which is a new function introduced
by the patch, with the existing function
TransactionIdIsKnownCompleted().


> > just_remove_TransactionIdIsKnownCompleted_call.v1.patch
>
> I think this might contain a different diff than what the name suggests?

Not at all, please don't be confused by the name. The patch removes
the call to TransactionIdIsKnownCompleted() from
TransactionIdIsInProgress().

I'm not sure it is possible to remove TransactionIdIsKnownCompleted()
in backbranches.


> > just removes the known offending call, passes make check, but IMHO
> > leaves the same error just as likely by other callers.
>
> Which other callers are you referring to?

I don't know of any, but we can't just remove a function in a
backbranch, can we?


> To me it seems that the "real" reason behind this specific issue being
> nontrivial to fix and behind the frequent error of calling
> TransactionIdDidCommit() before TransactionIdIsInProgress() is
> that we've really made a mess out of the transaction status determination code
> / API. IMO the original sin is requiring the complicated
>
> if (TransactionIdIsCurrentTransactionId(xid)
> ...
> else if (TransactionIdIsInProgress(xid)
> ...
> else if (TransactionIdDidCommit(xid)
> ...
>
> dance at pretty much any place checking transaction status.

Agreed, it's pretty weird to have to call the functions in the right
order or you get the wrong answer. Especially since we have no
cross-check to ensure the correct sequence was followed.


> The multiple calls
> for the same xid is, I think, what pushed the caching down to the
> TransactionLogFetch level.
>
> ISTM that we should introduce something like TransactionIdStatus(xid) that
> returns
> XACT_STATUS_CURRENT
> XACT_STATUS_IN_PROGRESS
> XACT_STATUS_COMMITTED
> XACT_STATUS_ABORTED
> accompanied by a TransactionIdStatusMVCC(xid, snapshot) that checks
> XidInMVCCSnapshot() instead of TransactionIdIsInProgress().
>
> I think that'd also make visibility checks faster - we spend a good chunk of
> cycles doing unnecessary function calls and repeatedly gathering information.
>
>
> It's also kind of weird that TransactionIdIsCurrentTransactionId() isn't more
> optimized - TransactionIdIsInProgress() knows it doesn't need to check
> anything before RecentXmin, but TransactionIdIsCurrentTransactionId()
> doesn't. Nor does TransactionIdIsCurrentTransactionId() check if the xid is
> smaller than GetTopTransactionIdIfAny() before bsearching through subxids.
>
> Of course anything along these lines would never be backpatchable...

Agreed
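
A sketch of what that could look like (names as floated above, purely
illustrative):

    typedef enum XactStatus
    {
        XACT_STATUS_CURRENT,
        XACT_STATUS_IN_PROGRESS,
        XACT_STATUS_COMMITTED,
        XACT_STATUS_ABORTED
    } XactStatus;

    extern XactStatus TransactionIdStatus(TransactionId xid);
    extern XactStatus TransactionIdStatusMVCC(TransactionId xid,
                                              Snapshot snapshot);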

I've presented a simple patch intended for backpatching. You may not
like the name, but please have another look at
"just_remove_TransactionIdIsKnownCompleted_call.v1.patch".
I believe it is backpatchable with minimal impact and without loss of
performance.

-- 
Simon Riggs
http://www.EnterpriseDB.com/




Re: pgsql: Add suport for server-side LZ4 base backup compression.

2022-02-12 Thread Robert Haas
On Fri, Feb 11, 2022 at 11:39 PM Michael Paquier  wrote:
> Perhaps you'd better check that 'decompress_program' can be executed
> and skip things if the command is not around?  Note that
> 020_pg_receivewal.pl does an extra lz4 --version as a safety measure,
> but 008_untar.pl does not.

You may be right, but I'm not entirely convinced. $ENV{'LZ4'} here is
being set by make, and make is setting it to whatever configure found.
If configure found a version of the lz4 program that doesn't actually
work, isn't that configure's fault, or the system administrator's
fault, rather than this test's fault?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-02-12 Thread Bharath Rupireddy
On Thu, Feb 10, 2022 at 9:55 PM Peter Geoghegan  wrote:
>
> On Sun, Feb 6, 2022 at 7:45 AM Robert Haas  wrote:
> > For what it's worth, I am generally in favor of having something like
> > this in PostgreSQL. I think it's wrong of us to continue assuming that
> > everyone has command-line access. Even when that's true, it's not
> > necessarily convenient. If you choose to use a relational database,
> > you may be the sort of person who likes SQL. And if you are, you may
> > want to have the database tell you what's going on via SQL rather than
> > command-line tools or operating system utilities. Imagine if we didn't
> > have pg_stat_activity and you had to get that information by running a
> > separate binary. Would anyone like that? Why is this case any
> > different?
>
> +1. An SQL interface is significantly easier to work with. Especially
> because it can use the built-in LSN type, pg_lsn.
>
> I don't find the slippery slope argument convincing. There aren't that
> many other things that are like pg_waldump, but haven't already been
> exposed via an SQL interface. Offhand, I can't think of any.

On Sat, Feb 12, 2022 at 4:03 AM Andrew Dunstan  wrote:
>
> Almost completely off topic, but this reminded me of an incident about
> 30 years ago at my first gig as an SA/DBA. There was an application
> programmer who insisted on loading a set of values from a text file into
> a temp table (it was Ingres, anyone remember that?). Why? Because he
> knew how to write "Select * from mytable order by mycol" but didn't know
> how to drive the Unix sort utility at the command line. When I was
> unable to restrain myself from smiling at this he got very angry and
> yelled at me loudly.
>
> So, yes, some people do like SQL and hate the command line.

Thanks a lot for the comments. I'm looking forward to the review of
the latest v4 patches posted at [1].

[1] 
https://www.postgresql.org/message-id/CALj2ACUS9%2B54QGPtUjk76dcYW-AMKp3hPe-U%2BpQo2-GpE4kjtA%40mail.gmail.com

Regards,
Bharath Rupireddy.




Re: Accommodate startup process in a separate ProcState array slot instead of in MaxBackends slots.

2022-02-12 Thread Bharath Rupireddy
On Fri, Feb 11, 2022 at 7:56 PM Yura Sokolov  wrote:
>
> On Sat, 16 Oct 2021 at 16:37 +0530, Bharath Rupireddy wrote:
> > On Thu, Oct 14, 2021 at 10:56 AM Fujii Masao
> >  wrote:
> > > On 2021/10/12 15:46, Bharath Rupireddy wrote:
> > > > On Tue, Oct 12, 2021 at 5:37 AM Fujii Masao 
> > > >  wrote:
> > > > > On 2021/10/12 4:07, Bharath Rupireddy wrote:
> > > > > > Hi,
> > > > > >
> > > > > > While working on [1], it is found that currently the ProcState array
> > > > > > doesn't have entries for auxiliary processes, it does have entries 
> > > > > > for
> > > > > > MaxBackends. But the startup process is eating up one slot from
> > > > > > MaxBackends. We need to increase the size of the ProcState array by 
> > > > > > 1
> > > > > > at least for the startup process. The startup process uses ProcState
> > > > > > slot via InitRecoveryTransactionEnvironment->SharedInvalBackendInit.
> > > > > > The procState array size is initialized to MaxBackends in
> > > > > > SInvalShmemSize.
> > > > > >
> > > > > > The consequence of not fixing this issue is that the database may 
> > > > > > hit
> > > > > > the error "sorry, too many clients already" soon in
> > > > > > SharedInvalBackendInit.
> > >
> > > On second thought, I wonder if this error could not happen in practice. 
> > > No?
> > > Because autovacuum doesn't work during recovery and the startup process
> > > can safely use the ProcState entry for autovacuum worker process.
> > > Also since the minimal allowed value of autovacuum_max_workers is one,
> > > the ProcState array guarantees to have at least one entry for autovacuum 
> > > worker.
> > >
> > > If this understanding is right, we don't need to enlarge the array and
> > > can just update the comment. I don't strongly oppose to enlarge
> > > the array in the master, but I'm not sure it's worth doing that
> > > in back branches if the issue can cause no actual error.
> >
> > Yes, the issue can't happen. The comment in the SInvalShmemSize,
> > mentioning about the startup process always having an extra slot
> > because the autovacuum worker is not active during recovery, looks
> > okay. But, is it safe to assume that always? Do we have a way to
> > specify that in the form an Assert(when_i_am_startup_proc &&
> > autovacuum_not_running) (this looks a bit dirty though)? Instead, we
> > can just enlarge the array in the master and be confident about the
> > fact that the startup process always has one dedicated slot.
>
> But this slot won't be used for most of the cluster's life. It will be
> just wasted.

Correct. On a standby the autovacuum launcher and workers are not started,
so the startup process will always have a slot free for it to use.

> And `Assert(there_is_startup_proc && autovacuum_not_running)` has
> value on its own, hasn't it? So why not add it with a comment?

The assertion doesn't make sense to me now, because the postmaster ensures
that the autovacuum launcher/workers will not get started in standby
mode, and we can't reliably know in InitRecoveryTransactionEnvironment
(the startup process) whether or not the autovacuum launcher has been
started.

FWIW, here's a patch just adding a comment on how the startup process
can get a free procState array slot even when SInvalShmemSize hasn't
accounted for it.
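
For context, the sizing logic in question boils down to this (paraphrased
from sinvaladt.c as of this thread, not a verbatim quote):

    Size
    SInvalShmemSize(void)
    {
        Size        size;

        size = offsetof(SISeg, procState);
        /* no explicit slot for the startup process -- it borrows one */
        size = add_size(size, mul_size(sizeof(ProcState), MaxBackends));

        return size;
    }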

Regards,
Bharath Rupireddy.


v1-0001-Add-comment-about-startup-process-getting-procSta.patch
Description: Binary data


Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

2022-02-12 Thread Alexander Lakhin
Hello Andres,
11.02.2022 05:22, Andres Freund wrote:
> Over in another thread I made some wild unsubstantiated guesses that the
> windows issues could have been made much more likely by a somewhat odd bit of
> code in PQisBusy():
>
> https://postgr.es/m/1959196.1644544971%40sss.pgh.pa.us
>
> Alexander, any chance you'd try if that changes the likelihood of the problem
> occurring, without any other fixes / reverts applied?
Unfortunately I haven't seen an improvement for the test in question.
With the PQisBusy-fix.patch from [1] and without any other changes on
the master branch (52377bb8) it still fails (on iterations 13, 5, 2, 2
for me).
The diagnostic logging (in attachment) added:
2022-02-12 01:04:16.341 PST [4912] LOG:  libpqrcv_receive: PQgetCopyData
returned 0
2022-02-12 01:04:16.341 PST [4912] LOG:  libpqrcv_receive: PQgetCopyData
2 returned -1
2022-02-12 01:04:16.341 PST [4912] LOG:  libpqrcv_receive:
end-of-streaming or error: -1
2022-02-12 01:04:16.341 PST [4912] LOG:  libpqrcv_PQgetResult:
streamConn->asyncStatus: 1 && streamConn->status: 0
2022-02-12 01:04:16.341 PST [4912] LOG:  libpqrcv_receive
libpqrcv_PQgetResult returned 10551584, 1
2022-02-12 01:04:16.341 PST [4912] LOG:  libpqrcv_receive
libpqrcv_PQgetResult PGRES_COMMAND_OK
2022-02-12 01:04:16.341 PST [4912] LOG:  libpqrcv_PQgetResult:
streamConn->asyncStatus: 1 && streamConn->status: 0
2022-02-12 01:04:16.341 PST [4912] LOG:  libpqrcv_PQgetResult loop
before WaitLatchOrSocket
2022-02-12 01:04:16.341 PST [4912] LOG:  WSAEventSelect event->fd: 948,
flags: 21
2022-02-12 01:04:16.341 PST [4912] LOG:  WaitLatchOrSocket before
WaitEventSetWait
2022-02-12 01:04:16.341 PST [4912] LOG:  WaitEventSetWait before
WaitEventSetWaitBlock
2022-02-12 01:04:16.341 PST [4912] LOG:  WaitEventSetWaitBlock before
WaitForMultipleObjects: 3
...
shows that before the doomed WaitForMultipleObjects() call the field
conn->status is 0 (CONNECTION_OK).

[1] https://www.postgresql.org/message-id/2187263.1644616494%40sss.pgh.pa.us

Best regards,
Alexander
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 0d89db4e6a..0776885306 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -34,6 +34,9 @@
 #include "utils/pg_lsn.h"
 #include "utils/tuplestore.h"
 
+#undef ENABLE_THREAD_SAFETY
+#include "libpq-int.h"
+
 PG_MODULE_MAGIC;
 
 void		_PG_init(void);
@@ -696,10 +699,13 @@ libpqrcv_PQgetResult(PGconn *streamConn)
 	 * Collect data until PQgetResult is ready to get the result without
 	 * blocking.
 	 */
+elog(LOG, "libpqrcv_PQgetResult: streamConn->asyncStatus: %d && streamConn->status: %d", streamConn->asyncStatus, streamConn->status);
+
 	while (PQisBusy(streamConn))
 	{
 		int			rc;
 
+elog(LOG, "libpqrcv_PQgetResult loop before WaitLatchOrSocket");
 		/*
 		 * We don't need to break down the sleep into smaller increments,
 		 * since we'll get interrupted by signals and can handle any
@@ -711,6 +717,7 @@ libpqrcv_PQgetResult(PGconn *streamConn)
 			   PQsocket(streamConn),
 			   0,
 			   WAIT_EVENT_LIBPQWALRECEIVER_RECEIVE);
+elog(LOG, "libpqrcv_PQgetResult loop after WaitLatchOrSocket: %d", rc);
 
 		/* Interrupted? */
 		if (rc & WL_LATCH_SET)
@@ -771,6 +778,7 @@ libpqrcv_receive(WalReceiverConn *conn, char **buffer,
 
 	/* Try to receive a CopyData message */
	rawlen = PQgetCopyData(conn->streamConn, &conn->recvBuf, 1);
+elog(LOG, "libpqrcv_receive: PQgetCopyData returned %d", rawlen);
 	if (rawlen == 0)
 	{
 		/* Try consuming some data. */
@@ -782,6 +790,7 @@ libpqrcv_receive(WalReceiverConn *conn, char **buffer,
 
 		/* Now that we've consumed some input, try again */
		rawlen = PQgetCopyData(conn->streamConn, &conn->recvBuf, 1);
+elog(LOG, "libpqrcv_receive: PQgetCopyData 2 returned %d", rawlen);
 		if (rawlen == 0)
 		{
 			/* Tell caller to try again when our socket is ready. */
@@ -791,15 +800,20 @@ libpqrcv_receive(WalReceiverConn *conn, char **buffer,
 	}
 	if (rawlen == -1)			/* end-of-streaming or error */
 	{
+elog(LOG, "libpqrcv_receive: end-of-streaming or error: %d", rawlen);
+
 		PGresult   *res;
 
 		res = libpqrcv_PQgetResult(conn->streamConn);
+elog(LOG, "libpqrcv_receive libpqrcv_PQgetResult returned %p, %d", res, PQresultStatus(res));
 		if (PQresultStatus(res) == PGRES_COMMAND_OK)
 		{
+elog(LOG, "libpqrcv_receive libpqrcv_PQgetResult PGRES_COMMAND_OK");
 			PQclear(res);
 
 			/* Verify that there are no more results. */
 			res = libpqrcv_PQgetResult(conn->streamConn);
+elog(LOG, "libpqrcv_receive libpqrcv_PQgetResult 2 returned %p", res);
 			if (res != NULL)
 			{
 PQclear(res);
@@ -809,6 +823,7 @@ libpqrcv_receive(WalReceiverConn *conn, char **buffer,
  * we'd seen an error, or PGRES_COPY_IN) don't report an error
  * here, but let callers deal with it.
  */
+elog(LOG, "libpqrcv_receive PQstatus(conn->streamConn) returned %d", 

Re: pgsql: Add TAP test to automate the equivalent of check_guc

2022-02-12 Thread Christoph Berg
Re: Michael Paquier
> On Fri, Feb 11, 2022 at 10:48:11AM +0100, Christoph Berg wrote:
> > It never caused any problem in the 12+ years we have been doing this,
> > but Debian is patching pg_config not to be relocatable since we are
> > installing into /usr/lib/postgresql/NN /usr/share/postgresql/NN, so
> > not a single prefix.
> > 
> > https://salsa.debian.org/postgresql/postgresql/-/blob/15/debian/patches/50-per-version-dirs.patch
> 
> Wow.  This is the way for Debian to bypass the fact that ./configure
> is itself patched, hence you cannot rely on things like --libdir,
> --bindir and the kind at build time?  That's..  Err..  Fancy, I'd
> say.

--libdir and --bindir will point at the final install locations.

I think the "bug" here is that vanilla PG doesn't support installing
in FHS locations with /usr/lib and /usr/share split, hence the Debian
patch.

Christoph