Re: missing documentation for streaming in-progress transactions

2021-04-08 Thread Amit Kapila
On Fri, Apr 9, 2021 at 9:58 AM Justin Pryzby  wrote:
>
> On Wed, Apr 07, 2021 at 05:45:16PM +0530, Amit Kapila wrote:
> > On Wed, Apr 7, 2021 at 1:11 PM Ajin Cherian  wrote:
> > >
> > > Found that some documentation hasn't been updated for the changes made as 
> > > part of
> > > streaming large in-progress transactions. Attached a patch that adds the 
> > > missing changes. Let me know if anything more needs to be added or any 
> > > comments on this change.
> > >
> >
> > Thanks, this mostly looks good to me. I have slightly modified it.
> > See, what do you think of the attached?
>
> +   Protocol version. Currently versions 1 and
> +   2 are supported. The version 2
> +   is supported only for server versions 14 and above, and is used to 
> allow
> +   streaming of large in-progress transactions.
>
> The diff briefly confused me, since this is in protocol.sgml, and since the
> libpq protocol version is 1/2/3, with 2 being removed in v14 (3174d69fb).
> I suggest to say "replication protocol version 2".
>
> I realize that the headings make this more clear when reading the .html, so
> maybe there's no issue.
>

Yeah, that was the reason for not including 'replication'. If one is
reading the document or even the *.sgml, there shouldn't be any confusion,
but if you or others feel so, we can use 'replication' as well.
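
For reference, the user-facing knob that exercises the new protocol version is
the subscription-level streaming option; a hedged sketch (the subscription,
publication and connection values are made up):

CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION mypub
    WITH (streaming = on);  -- apply worker then requests proto_version 2 (PG14+)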

-- 
With Regards,
Amit Kapila.




Re: doc review for v14

2021-04-08 Thread Michael Paquier
On Thu, Apr 08, 2021 at 11:40:08AM -0500, Justin Pryzby wrote:
> Another round of doc review, not yet including all of yesterday's commits.

Thanks for compiling all that.  I went through the whole set and
applied the most relevant parts on HEAD.  Some of them applied down to
9.6, so I have back-patched those where needed, for the parts that did not
conflict too heavily.
--
Michael


signature.asc
Description: PGP signature


Re: Simplify backend terminate and wait logic in postgres_fdw test

2021-04-08 Thread Bharath Rupireddy
On Fri, Apr 9, 2021 at 7:29 AM Michael Paquier  wrote:
>
> On Fri, Apr 09, 2021 at 06:53:21AM +0530, Bharath Rupireddy wrote:
> > I didn't think of the warning cases, my bad. How about using SET
> > client_min_messages = 'ERROR'; before we call
> > pg_wait_for_backend_termination? We can then depend only on the return
> > value of pg_wait_for_backend_termination, and exit when it is true. This
> > way the buildfarm will not see warnings. Thoughts?
>
> You could do that, but I would also bet that this is going to get
> forgotten in the future if this gets extended in more SQL tests that
> are output-sensitive, in or out of core.  Honestly, I can get behind a
> warning in pg_wait_for_backend_termination() to inform that the
> process poked at is not a PostgreSQL one, because it offers new and
> useful information to the user.  But, and my apologies for sounding a
> bit noisy, I really don't get why pg_wait_until_termination() has any
> need to do that.  From what I can see, it provides the following
> information:
> - A PID, that we already know from the caller or just from
> pg_stat_activity.
> - A timeout, already known as well.
> - The fact that the process did not terminate, information given by
> the "false" status, only used in this case.
>
> So there is no new information here to the user, only a duplicate of
> what's already known to the caller of this function.  I see more
> advantages in removing this WARNING rather than keeping it.

IMO it does make sense to provide a warning for a bool-returning
function if there are multiple situations in which the function
returns false. This gives clear information as to why false was
returned.

pg_terminate_backend: false is returned 1) when the process with the given
pid is not a backend (warning "PID %d is not a PostgreSQL server
process"), 2) if the kill() fails (warning "could not send signal to
process %d: %m"), 3) if a timeout is specified and the backend is not
terminated within it (warning "backend with PID %d did not terminate
within %lld milliseconds").
pg_cancel_backend: false is returned 1) when the process with the
given pid is not a backend, 2) if the kill() fails.
pg_wait_for_backend_termination: false is returned 1) when the process
with the given pid is not a backend, 2) if the backend is not terminated
within the timeout.

If we ensure that each of the above functions returns false in only one
situation and raises an error in all other situations, then removing the
warnings makes sense.

Having said that, there seems to be a reason for issuing a warning
and returning false instead of an error: the callers can then
call these functions in a loop until they return true. See the below
comments:
/*
 * This is just a warning so a loop-through-resultset will not abort
 * if one backend terminated on its own during the run.
 */
/* Again, just a warning to allow loops */

I would like to keep the behaviour of these functions as-is.
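
To illustrate the "loop-through-resultset" pattern those comments protect (a
sketch, not taken from the regression tests): if one of the scanned backends
exits on its own between the pg_stat_activity scan and the kill(), only a
WARNING is raised and the remaining rows are still processed:

SELECT pg_terminate_backend(pid)
  FROM pg_stat_activity
 WHERE backend_type = 'client backend'
   AND pid <> pg_backend_pid();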

> You could do that, but I would also bet that this is going to get
> forgotten in the future if this gets extended in more SQL tests that
> are output-sensitive, in or out of core

As for the concern that hackers wanting to use these functions in more
SQL tests may forget that pg_terminate_backend,
pg_cancel_backend and pg_wait_for_backend_termination issue a warning in
some cases, which might pollute the tests if used without suppressing
these warnings, I feel that is best left to the patch implementers and
the reviewers. On our part, we documented that
pg_terminate_backend and pg_wait_for_backend_termination emit a
warning on timeout: "On timeout a warning is emitted and
false is returned".
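
A hedged sketch of that documented timeout case (the PID and timeout are
illustrative, and assume the target backend is stuck and ignores the signal):

SELECT pg_terminate_backend(12345, 50);
-- WARNING:  backend with PID 12345 did not terminate within 50 milliseconds
--  pg_terminate_backend
-- ----------------------
--  f
-- (1 row)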

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Lots of incorrect comments in nodeFuncs.c

2021-04-08 Thread David Rowley
On Fri, 9 Apr 2021 at 13:52, Michael Paquier  wrote:
>
> On Thu, Apr 08, 2021 at 09:21:30PM -0400, Alvaro Herrera wrote:
> > On 2021-Apr-08, Tom Lane wrote:
> >> Maybe like
> >>
> >> case T_ScalarArrayOpExpr:
> >> /* ScalarArrayOpExpr's result is boolean ... */
> >> coll = InvalidOid;  /* ... so it has no collation 
> >> */
> >> break;
> >
> > This is much clearer, yeah.
>
> +1.

Yeah, that's much better.

For the exprSetCollation case, I ended up with:

 case T_ScalarArrayOpExpr:
 /* ScalarArrayOpExpr's result is boolean ... */
 Assert(!OidIsValid(collation)); /* ... so never
set a collation */

I wanted something more like /* ... so we must never set a collation
*/ but that made the line longer than 80 characters. I thought wrapping to a
second line was excessive, so I shortened it to the above.

David


improve_possibly_misleading_comments_in_nodefuncs.patch
Description: Binary data


Re: missing documentation for streaming in-progress transactions

2021-04-08 Thread Justin Pryzby
On Wed, Apr 07, 2021 at 05:45:16PM +0530, Amit Kapila wrote:
> On Wed, Apr 7, 2021 at 1:11 PM Ajin Cherian  wrote:
> >
> > Found that some documentation hasn't been updated for the changes made as 
> > part of
> > streaming large in-progress transactions. Attached a patch that adds the 
> > missing changes. Let me know if anything more needs to be added or any 
> > comments on this change.
> >
> 
> Thanks, this mostly looks good to me. I have slightly modified it.
> See, what do you think of the attached?

+   Protocol version. Currently versions 1 and
+   2 are supported. The version 2
+   is supported only for server versions 14 and above, and is used to allow
+   streaming of large in-progress transactions.

The diff briefly confused me, since this is in protocol.sgml, and since the
libpq protocol version is 1/2/3, with 2 being removed in v14 (3174d69fb).
I suggest to say "replication protocol version 2".

I realize that the headings make this more clear when reading the .html, so
maybe there's no issue.

-- 
Justin




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2021-04-08 Thread Justin Pryzby
Breaking with tradition, the previous patch included one too *few* changes, and
failed to resolve the OID collisions.
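
For context, a hedged sketch of how the pg_ls_dir_metadata function added in
this series would be called, going by the signature documented in the patch
(dirname, missing_ok, include_dot_dirs):

SELECT filename, size, modification
  FROM pg_ls_dir_metadata('.', false, false)
 ORDER BY modification DESC;
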
>From d3d33b25e8571f5a2a3124e85164321f19ca Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 16 Mar 2020 14:12:55 -0500
Subject: [PATCH v28 01/11] Document historic behavior of links to
 directories..

Backpatch to 9.5: pg_stat_file
---
 doc/src/sgml/func.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0606b6a9aa..aa0dcde886 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27018,7 +27018,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).


 This function is restricted to superusers by default, but other users
-- 
2.17.0

>From c6300e4cc9fd6826530ac5a5c49eaac7f02c49c0 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 17 Mar 2020 13:16:24 -0500
Subject: [PATCH v28 02/11] Add tests on pg_ls_dir before changing it

---
 src/test/regress/expected/misc_functions.out | 24 
 src/test/regress/sql/misc_functions.sql  |  8 +++
 2 files changed, 32 insertions(+)

diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index e845042d38..ea0fc48dbd 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -214,6 +214,30 @@ select count(*) > 0 from
  t
 (1 row)
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+ name 
+--
+ .
+(1 row)
+
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+ name 
+--
+(0 rows)
+
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+ pg_ls_dir 
+---
+(0 rows)
+
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+ERROR:  could not open directory "does not exist": No such file or directory
+-- Check that expected columns are present
+select * from pg_stat_file('.') limit 0;
+ size | access | modification | change | creation | isdir 
+--++--++--+---
+(0 rows)
+
 --
 -- Test adding a support function to a subject function
 --
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index a398349afc..eb6ac12ab4 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -69,6 +69,14 @@ select count(*) > 0 from
where spcname = 'pg_default') pts
   join pg_database db on pts.pts = db.oid;
 
+select * from (select pg_ls_dir('.', false, true) as name) as ls where ls.name='.'; -- include_dot_dirs=true
+select * from (select pg_ls_dir('.', false, false) as name) as ls where ls.name='.'; -- include_dot_dirs=false
+select pg_ls_dir('does not exist', true, false); -- ok with missingok=true
+select pg_ls_dir('does not exist'); -- fails with missingok=false
+
+-- Check that expected columns are present
+select * from pg_stat_file('.') limit 0;
+
 --
 -- Test adding a support function to a subject function
 --
-- 
2.17.0

>From 5d352f9fee18f44a03f5c373b4aec71f88933402 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 9 Mar 2020 22:40:24 -0500
Subject: [PATCH v28 03/11] Add pg_ls_dir_metadata to list a dir with file
 metadata..

Generalize pg_ls_dir_files and retire pg_ls_dir

Need catversion bumped?
---
 doc/src/sgml/func.sgml   |  21 ++
 src/backend/catalog/system_views.sql |   1 +
 src/backend/utils/adt/genfile.c  | 233 +++
 src/include/catalog/pg_proc.dat  |  12 +
 src/test/regress/expected/misc_functions.out |  24 ++
 src/test/regress/input/tablespace.source |   5 +
 src/test/regress/output/tablespace.source|   8 +
 src/test/regress/sql/misc_functions.sql  |  11 +
 8 files changed, 222 insertions(+), 93 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index aa0dcde886..507a6d73f8 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25821,6 +25821,27 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());

   
 
+  
+   
+
+ pg_ls_dir_metadata
+
+pg_ls_dir_metadata ( dirname text
+, missing_ok boolean,
+include_dot_dirs boolean  )
+setof record
+( filename text,
+size bigint,
+modification timestamp with time zone )
+   
+   
+For each file in the specified directory, list the file and its
+

Re: missing documentation for streaming in-progress transactions

2021-04-08 Thread Amit Kapila
On Fri, Apr 9, 2021 at 8:29 AM Ajin Cherian  wrote:
>
> On Fri, Apr 9, 2021 at 10:23 AM Euler Taveira  wrote:
>>
>>
>> I didn't like this style because it is not descriptive enough. It is also 
>> not a
>> style adopted by Postgres. I suggest to add something like "This field was
>> introduced in version 2" or "This field is available since version 2" after 
>> the
>> field description.
>
>
> I have updated this to  "Since protocol version 2"
>>
>>
>> +Xid of the sub-transaction (will be same as xid of the 
>> transaction for top-level
>> +transactions).
>> +
>>
>> Although, sub-transaction is also used in the documentation, I suggest to use
>> subtransaction. Maybe change the other sub-transaction occurrences too.
>
>
> Updated.
>

I don't like repeating the same thing for all the new messages, so I added a
separate para for that and made a few other changes. See what you
think of the attached?

-- 
With Regards,
Amit Kapila.


v5-0001-doc-Update-information-of-new-messages-for-logica.patch
Description: Binary data


Re: PostmasterIsAlive() in recovery (non-USE_POST_MASTER_DEATH_SIGNAL builds)

2021-04-08 Thread Thomas Munro
On Wed, Mar 31, 2021 at 7:02 PM Thomas Munro  wrote:
> On Fri, Mar 12, 2021 at 7:55 PM Thomas Munro  wrote:
> > On Thu, Mar 11, 2021 at 7:34 PM Michael Paquier  wrote:
> > > Wow.  This probably means that we would be able to get rid of
> > > USE_POSTMASTER_DEATH_SIGNAL?
> >
> > We'll still need it, because there'd still be systems with no signal:
> > NetBSD, OpenBSD, AIX, HPUX, illumos.
>
> Erm, and of course macOS.
>
> There is actually something we could do here: ioctl(I_SETSIG) for
> SysV-derived systems and fcntl(O_ASYNC) for BSD-derived systems seems
> like a promising way to get a SIGIO signal when the postmaster goes
> away and the pipe becomes readable.  Previously I'd discounted this,
> because it's not in POSIX and I doubted it would work well on other
> systems.  But I was flicking through Stevens' UNIX book while trying
> to figure out that POLLHUP stuff from a nearby thread (though it's
> useless for that purpose) and I learned from section 12.6 that SIGIO
> is fairly ancient, originating in 4.2BSD, adopted by SVR4, so it's
> likely present in quite a few systems, maybe even all of our support
> platforms if we're prepared to do it two different ways.  Just a
> thought.

Alright, here's a proof-of-concept patch that does that.  Adding to the next CF.

This seems to work on Linux, macOS, FreeBSD and OpenBSD (and I assume
any other BSD).  Can anyone tell me if it works on illumos, AIX or
HPUX, and if not, how to fix it or disable the feature gracefully?
For now the patch assumes that if you have SIGIO then you can do this;
perhaps it should also test for O_ASYNC?  Perhaps HPUX has the signal
but requires a different incantation with I_SETSIG?

Full disclosure: The reason for my interest in this subject is that I
have a work-in-progress patch set to make latches, locks and condition
variables more efficient using futexes on several OSes, but it needs a
signal to wake on postmaster death.
From acff1b99d464eca453cdde3f5ca8079c925e47ac Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Fri, 9 Apr 2021 16:00:07 +1200
Subject: [PATCH] Use SIGIO to detect postmaster death.

Previously we used non-POSIX calls prctl/procctl to request a signal on
death of our parent process on Linux and FreeBSD.  Instead, use O_ASYNC
to request SIGIO when the pipe is closed by the postmaster.  The effect
is about the same, but it should work on many other Unix systems too.
It's not in POSIX either, so it remains optional.

Discussion: https://postgr.es/m/CA%2BhUKGJiejg%3DGPTkf3S53N0-Vr4fOQ4wev7DmAVVLHbh%3DO9%2Bdg%40mail.gmail.com
---
 configure  |  2 +-
 configure.ac   |  2 --
 src/backend/storage/ipc/pmsignal.c | 41 --
 src/include/pg_config.h.in |  6 -
 src/include/storage/pmsignal.h | 11 +---
 src/tools/msvc/Solution.pm |  2 --
 6 files changed, 12 insertions(+), 52 deletions(-)

diff --git a/configure b/configure
index 70f4555264..9b69fb203a 100755
--- a/configure
+++ b/configure
@@ -13345,7 +13345,7 @@ $as_echo "#define HAVE_STDBOOL_H 1" >>confdefs.h
 fi
 
 
-for ac_header in atomic.h copyfile.h execinfo.h getopt.h ifaddrs.h langinfo.h mbarrier.h poll.h sys/epoll.h sys/event.h sys/ipc.h sys/prctl.h sys/procctl.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/sockio.h sys/tas.h sys/uio.h sys/un.h termios.h ucred.h wctype.h
+for ac_header in atomic.h copyfile.h execinfo.h getopt.h ifaddrs.h langinfo.h mbarrier.h poll.h sys/epoll.h sys/event.h sys/ipc.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/sockio.h sys/tas.h sys/uio.h sys/un.h termios.h ucred.h wctype.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default"
diff --git a/configure.ac b/configure.ac
index ba67c95bcc..f08c60acd6 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1360,8 +1360,6 @@ AC_CHECK_HEADERS(m4_normalize([
 	sys/epoll.h
 	sys/event.h
 	sys/ipc.h
-	sys/prctl.h
-	sys/procctl.h
 	sys/pstat.h
 	sys/resource.h
 	sys/select.h
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 280c2395c9..e9f5b4fc41 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include 
 #include 
 #include 
 
@@ -92,23 +93,6 @@ postmaster_death_handler(int signo)
 {
 	postmaster_possibly_dead = true;
 }
-
-/*
- * The available signals depend on the OS.  SIGUSR1 and SIGUSR2 are already
- * used for other things, so choose another one.
- *
- * Currently, we assume that we can always find a signal to use.  That
- * seems like a reasonable assumption for all platforms that are modern
- * enough to have a parent-death signaling mechanism.
- */
-#if defined(SIGINFO)
-#define POSTMASTER_DEATH_SIGNAL SIGINFO
-#elif defined(SIGPWR)
-#define POSTMASTER_DEATH_SIGNAL SIGPWR
-#else
-#error "cannot find a signal to use for 

Re: [POC] Fast COPY FROM command for the table with foreign partitions

2021-04-08 Thread Etsuro Fujita
On Fri, Apr 9, 2021 at 9:49 AM Justin Pryzby  wrote:
> I rebased this patch to resolve a trivial 1 line conflict from c5b7ba4e6.

Thanks for rebasing!

Actually, I've started reviewing this, but I couldn't finish my
review.  My apologies for not having much time on this.  I'll continue
to work on it for PG15.

Sorry for the empty email.

Best regards,
Etsuro Fujita




Re: [POC] Fast COPY FROM command for the table with foreign partitions

2021-04-08 Thread Etsuro Fujita
On Fri, Apr 9, 2021 at 9:49 AM Justin Pryzby  wrote:
>
> I rebased this patch to resolve a trivial 1 line conflict from c5b7ba4e6.
>
> --
> Justin




Re: WIP: WAL prefetch (another approach)

2021-04-08 Thread Justin Pryzby
Here are some small language fixes.

BTW, before beginning "recovery", PG syncs all the data dirs.
This can be slow, and it seems like the slowness is frequently due to file
metadata.  For example, that's an obvious consequence of an OS crash, after
which the page cache is empty.  I've made a habit of running find /zfs -ls |wc
to pre-warm it, which can take a little bit, but then the recovery process
starts moments later.  I don't have any timing measurements, but I expect that
starting to stat() all data files as soon as possible would be a win.

commit cc9707de333fe8242607cde9f777beadc68dbf04
Author: Justin Pryzby 
Date:   Thu Apr 8 10:43:14 2021 -0500

WIP: doc review: Optionally prefetch referenced data in recovery.

1d257577e08d3e598011d6850fd1025858de8c8c

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bc4a8b2279..139dee7aa2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3621,7 +3621,7 @@ include_dir 'conf.d'
 pool after that.  However, on file systems with a block size larger
 than
 PostgreSQL's, prefetching can avoid a
-costly read-before-write when a blocks are later written.
+costly read-before-write when blocks are later written.
 The default is off.

   
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 24cf567ee2..36e00c92c2 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -816,9 +816,7 @@
prefetching mechanism is most likely to be effective on systems
with full_page_writes set to
off (where that is safe), and where the working
-   set is larger than RAM.  By default, prefetching in recovery is enabled
-   on operating systems that have posix_fadvise
-   support.
+   set is larger than RAM.  By default, prefetching in recovery is disabled.
   
  
 
diff --git a/src/backend/access/transam/xlogprefetch.c 
b/src/backend/access/transam/xlogprefetch.c
index 28764326bc..363c079964 100644
--- a/src/backend/access/transam/xlogprefetch.c
+++ b/src/backend/access/transam/xlogprefetch.c
@@ -31,7 +31,7 @@
  * stall; this is counted with "skip_fpw".
  *
  * The only way we currently have to know that an I/O initiated with
- * PrefetchSharedBuffer() has that recovery will eventually call ReadBuffer(),
+ * PrefetchSharedBuffer() has that recovery will eventually call ReadBuffer(), 
XXX: what ??
  * and perform a synchronous read.  Therefore, we track the number of
  * potentially in-flight I/Os by using a circular buffer of LSNs.  When it's
  * full, we have to wait for recovery to replay records so that the queue
@@ -660,7 +660,7 @@ XLogPrefetcherScanBlocks(XLogPrefetcher *prefetcher)
/*
 * I/O has possibly been initiated (though we don't 
know if it was
 * already cached by the kernel, so we just have to 
assume that it
-* has due to lack of better information).  Record this 
as an I/O
+* was due to lack of better information).  Record this 
as an I/O
 * in progress until eventually we replay this LSN.
 */
XLogPrefetchIncrement(>prefetch);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 090abdad8b..8c72ba1f1a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2774,7 +2774,7 @@ static struct config_int ConfigureNamesInt[] =
{
{"wal_decode_buffer_size", PGC_POSTMASTER, WAL_ARCHIVE_RECOVERY,
gettext_noop("Maximum buffer size for reading ahead in 
the WAL during recovery."),
-   gettext_noop("This controls the maximum distance we can 
read ahead n the WAL to prefetch referenced blocks."),
+   gettext_noop("This controls the maximum distance we can 
read ahead in the WAL to prefetch referenced blocks."),
GUC_UNIT_BYTE
},
 	&wal_decode_buffer_size,




Re: TRUNCATE on foreign table

2021-04-08 Thread Kohei KaiGai
On Thu, Apr 8, 2021 at 22:14 Fujii Masao  wrote:
>
> On 2021/04/08 22:02, Kohei KaiGai wrote:
> >> Anyway, attached is the updated version of the patch. This is still based 
> >> on the latest Kazutaka-san's patch. That is, extra list for ONLY is still 
> >> passed to FDW. What about committing this version at first? Then we can 
> >> continue the discussion and change the behavior later if necessary.
>
> Pushed! Thank all involved in this development!!
> For record, I attached the final patch I committed.
>
>
> > Ok, it's fair enough for me.
> >
> > I'll try to sort out my thought, then raise a follow-up discussion if 
> > necessary.
>
> Thanks!
>
> The followings are the open items and discussion points that I'm thinking of.
>
> 1. Currently the extra information (TRUNCATE_REL_CONTEXT_NORMAL, 
> TRUNCATE_REL_CONTEXT_ONLY or TRUNCATE_REL_CONTEXT_CASCADING) about how a 
> foreign table was specified as the target to truncate in TRUNCATE command is 
> collected and passed to FDW. Does this really need to be passed to FDW? Seems 
> Stephen, Michael and I think that's necessary. But Kaigai-san does not. I 
> also think that TRUNCATE_REL_CONTEXT_CASCADING can be removed because there 
> seems no use case for that maybe.
>
> 2. Currently when the same foreign table is specified multiple times in the 
> command, the extra information only for the foreign table found first is 
> collected. For example, when "TRUNCATE ft, ONLY ft" is executed, 
> TRUNCATE_REL_CONTEXT_NORMAL is collected and _ONLY is ignored because "ft" is 
> found first. Is this OK? Or we should collect all, e.g., both _NORMAL and 
> _ONLY should be collected in that example? I think that the current approach 
> (i.e., collect the extra info about table found first if the same table is 
> specified multiple times) is good because even local tables are also treated 
> the same way. But Kaigai-san does not.
>
> 3. Currently postgres_fdw specifies ONLY clause in TRUNCATE command that it 
> constructs. That is, if the foreign table is specified with ONLY, 
> postgres_fdw also issues the TRUNCATE command for the corresponding remote 
> table with ONLY to the remote server. Then only root table is truncated in 
> remote server side, and the tables inheriting that are not truncated. Is this 
> behavior desirable? Seems Michael and I think this behavior is OK. But 
> Kaigai-san does not.
>
Prior to the discussion of 1-3, I'd like to clarify the role of foreign tables.
(Likely, it will lead to a natural conclusion for the above open items.)

As the literal meaning of SQL/MED (Management of External Data) suggests, a
foreign table is a representation of external data in PostgreSQL.
It allows us to read and (optionally) write the external data wrapped by
FDW drivers, just as we usually read / write heap tables.
Thanks to the FDW APIs, the core PostgreSQL code does not care about the
structure, location, volume and other characteristics of
the external data itself. It expects FDW API invocations to behave
as if we were accessing a regular heap table.

On the other hand, we can say local tables are a representation of
"internal" data in PostgreSQL.
A heap table consists of one or more files (one per BLCKSZ *
RELSEG_SIZE), and the table AM mediates between
the on-disk data and the in-memory structure (TupleTableSlot).
There is no big difference in the concept. Ok?

As you know, the ONLY clause controls whether the TRUNCATE command runs
on child tables as well, not only the parent.
If "ONLY parent_table" is given, its child tables are not picked up by
ExecuteTruncate(), unless the child tables are
listed individually.
Then, once ExecuteTruncate() has picked up the relations, it makes the
relations empty using the table AM
(relation_set_new_filenode), and the callee
(heapam_relation_set_new_filenode) does not care whether the
table was specified with ONLY or not. It just makes the data
represented by the table empty (in a transactional way).

So, how should foreign tables behave?

Once ExecuteTruncate() has picked up a foreign table according to
the ONLY clause, should the FDW driver consider
the context in which the foreign table was specified? And what behavior
is consistent?
I think the FDW driver should make the external data represented by
the foreign table empty, regardless of its
structure, location, volume and so on.

Therefore, if we follow the above assumption, we don't need to pass down
the context in which foreign tables were
picked up (TRUNCATE_REL_CONTEXT_*), so postgres_fdw should not adjust
the remote TRUNCATE query
according to the flags. It always truncates the entire tables (if there are
multiple) on behalf of the foreign tables.

As an aside, if a postgres_fdw foreign table maps the remote table with an
"ONLY" clause, that is exactly the situation where we should add
the "ONLY" clause to the TRUNCATE command, because the foreign table is a
representation of the remote "ONLY parent_table" in
this case.

What are your thoughts?
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 




Re: psql - add SHOW_ALL_RESULTS option

2021-04-08 Thread Bharath Rupireddy
On Fri, Apr 9, 2021 at 6:36 AM Michael Paquier  wrote:
>
> On Fri, Apr 09, 2021 at 01:11:35AM +0800, Julien Rouhaud wrote:
> > On Thu, Apr 08, 2021 at 07:04:01PM +0200, Fabien COELHO wrote:
> >> It is definitely an open item. I'm not sure where you want to add it…
> >> possibly the "Pg 14 Open Items" wiki page?
> >
> > Correct.
>
> I was running a long query this morning and wondered why the
> cancellation was suddenly broken.  So I am not alone, and here you are
> with already a solution :)
>
> So, studying through 3a51306, this stuff has changed the query
> execution from a sync PQexec() to an async PQsendQuery().  And the
> proposed fix changes back to the behavior where the cancellation
> reset happens after getting a result, as there is no need to cancel
> anything.
>
> No strong objections from here if the consensus is to make
> SendQueryAndProcessResults() handle the cancel reset properly though I
> am not sure if this is the cleanest way to do things, but let's make
> at least the whole business consistent in the code for all those code
> paths.  For example, PSQLexecWatch() does an extra ResetCancelConn()
> that would be useless once we are done with
> SendQueryAndProcessResults().  Also, I can see that
> SendQueryAndProcessResults() would not issue a cancel reset if the
> query fails, for \watch when cancel is pressed, and for \watch with
> COPY.  So, my opinion here would be to keep ResetCancelConn() within
> PSQLexecWatch(), just add an extra one in SendQuery() to make all the
> three code paths printing results consistent, and leave
> SendQueryAndProcessResults() out of the cancellation logic.

Hi, I'm also facing the query cancellation issue; I have to kill the
backend every time to cancel a query, and it's becoming difficult.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] force_parallel_mode and GUC categories

2021-04-08 Thread Justin Pryzby
On Fri, Apr 09, 2021 at 10:50:53AM +0900, Michael Paquier wrote:
> On Sat, Apr 03, 2021 at 08:25:46PM -0500, Justin Pryzby wrote:
> > Forking this thread
> > https://www.postgresql.org/message-id/20210403154336.GG29125%40momjian.us
> 
> Didn't see this one, thanks for forking.
> 
> -   {"force_parallel_mode", PGC_USERSET, QUERY_TUNING_OTHER,
> +   {"force_parallel_mode", PGC_USERSET, DEVELOPER_OPTIONS,
> And not this one either, as it is mainly a planner thing, like the
> other parameters in the same area.

This is the main motive behind the patch.

Developer options aren't shown in postgresql.conf.sample, which it seems
people sometimes read through quickly, setting a whole bunch of options that
sound good, sometimes including this one.  And in the best case they then ask
on -performance why their queries are slow and we tell them to turn it back off
to fix their issues.  This patch changes things so the option is no longer in
the .sample file, and calling it a "dev" option seems to be the classification
and mechanism by which to do that.
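
To make the symptom concrete, here is roughly what such users end up with (plan
shape from memory, not a verbatim run):

SET force_parallel_mode = on;
EXPLAIN (COSTS OFF) SELECT 1;
--  Gather
--    Workers Planned: 1
--    Single Copy: true
--    ->  Result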

-- 
Justin

PS: Maybe you saw that I'd already re-sent the patch without the accidental junk
hunks.




Re: Why is Query NOT getting cancelled with SIGINT in PG14?

2021-04-08 Thread Bharath Rupireddy
On Fri, Apr 9, 2021 at 8:38 AM Justin Pryzby  wrote:
>
> On Fri, Apr 09, 2021 at 08:24:51AM +0530, Bharath Rupireddy wrote:
> > Looks like the running query is not getting cancelled even though I
> > issue CTRL+C from psql or kill the backend with SIGINT. This only
> > happens with PG14 not in PG13. Am I missing something here? Is it a
> > bug?
>
> Yes, see here:
> https://www.postgresql.org/message-id/flat/OSZPR01MB631017521EE6887ADC9492E8FD759%40OSZPR01MB6310.jpnprd01.prod.outlook.com#e9228ef1ae32315f8b0df3fa67a32e06

Thanks. I had missed that thread. I will respond there.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Why is Query NOT getting cancelled with SIGINT in PG14?

2021-04-08 Thread Justin Pryzby
On Fri, Apr 09, 2021 at 08:24:51AM +0530, Bharath Rupireddy wrote:
> Looks like the running query is not getting cancelled even though I
> issue CTRL+C from psql or kill the backend with SIGINT. This only
> happens with PG14 not in PG13. Am I missing something here? Is it a
> bug?

Yes, see here:
https://www.postgresql.org/message-id/flat/OSZPR01MB631017521EE6887ADC9492E8FD759%40OSZPR01MB6310.jpnprd01.prod.outlook.com#e9228ef1ae32315f8b0df3fa67a32e06

-- 
Justin




Re: missing documentation for streaming in-progress transactions

2021-04-08 Thread Ajin Cherian
On Fri, Apr 9, 2021 at 10:23 AM Euler Taveira  wrote:

>
> I didn't like this style because it is not descriptive enough. It is also
> not a
> style adopted by Postgres. I suggest to add something like "This field was
> introduced in version 2" or "This field is available since version 2"
> after the
> field description.
>

I have updated this to  "Since protocol version 2"

>
> +Xid of the sub-transaction (will be same as xid of the
> transaction for top-level
> +transactions).
> +
>
> Although, sub-transaction is also used in the documentation, I suggest to
> use
> subtransaction. Maybe change the other sub-transaction occurrences too.
>

Updated.

regards,
Ajin Cherian
Fujitsu Australia


v4-0001-doc-Update-information-of-new-messages-for-logica.patch
Description: Binary data


Why is Query NOT getting cancelled with SIGINT in PG14?

2021-04-08 Thread Bharath Rupireddy
Hi,

Looks like the running query is not getting cancelled even though I
issue CTRL+C from psql or kill the backend with SIGINT. This only
happens with PG14 not in PG13. Am I missing something here? Is it a
bug?

create table t1(a1 int);
insert into t1 select * from generate_series(1,100);  --> I
chose an intentionally long running query, now either issue CTRL+C or
kill the backend with SIGINT, the query doesn't get cancelled. Note
that I don't even see "Cancel request sent" message on psql when I
issue CTRL+C.
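
For what it's worth, a hedged way to cross-check from another session whether an
explicit cancel request makes any difference (the WHERE clause is just
illustrative):

SELECT pg_cancel_backend(pid)
  FROM pg_stat_activity
 WHERE query LIKE 'insert into t1%'
   AND pid <> pg_backend_pid();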

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: test runner (was Re: SQL-standard function body)

2021-04-08 Thread Andres Freund
Hi,

On 2021-04-09 08:39:46 +0900, Michael Paquier wrote:
> On Thu, Apr 08, 2021 at 10:50:39AM -0700, Andres Freund wrote:
> > Obviously all very far from being ready, but this seemed like a good
> > enough excuse to mention it ;)
> 
> This is nice.  Are there any parallelism capabilities?

Yes. It defaults to number-of-cores processes, but obviously can also be
specified explicitly. One very nice part about it is that it'd work
largely the same on windows (which has practically unusable testing
right now). It probably doesn't yet, because I just tried to get it
to build and run tests at all, but it shouldn't be a lot of additional
work.

Greetings,

Andres Freund




Re: Simplify backend terminate and wait logic in postgres_fdw test

2021-04-08 Thread Kyotaro Horiguchi
At Fri, 9 Apr 2021 10:59:44 +0900, Michael Paquier  wrote 
in 
> On Fri, Apr 09, 2021 at 06:53:21AM +0530, Bharath Rupireddy wrote:
> > I didn't think of the warning cases, my bad. How about using SET
> > client_min_messages = 'ERROR'; before we call
> > pg_wait_for_backend_termination? We can then depend only on the return
> > value of pg_wait_for_backend_termination, and exit when it is true. This
> > way the buildfarm will not see warnings. Thoughts?
> 
> You could do that, but I would also bet that this is going to get
> forgotten in the future if this gets extended in more SQL tests that
> are output-sensitive, in or out of core.  Honestly, I can get behind a
> warning in pg_wait_for_backend_termination() to inform that the
> process poked at is not a PostgreSQL one, because it offers new and
> useful information to the user.  But, and my apologies for sounding a
> bit noisy, I really don't get why pg_wait_until_termination() has any
> need to do that.  From what I can see, it provides the following
> information:
> - A PID, that we already know from the caller or just from
> pg_stat_activity.
> - A timeout, already known as well.
> - The fact that the process did not terminate, information given by
> the "false" status, only used in this case.
> 
> So there is no new information here to the user, only a duplicate of
> what's already known to the caller of this function.  I see more
> advantages in removing this WARNING rather than keeping it.

FWIW I agree with Michael. I faintly remember thinking the same
while reviewing, but it seems that I forgot to write a comment like
that. It's the caller's job; concretely, the existing callers and
any possible script that calls the function.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: TRUNCATE on foreign table

2021-04-08 Thread Zhihong Yu
On Thu, Apr 8, 2021 at 6:47 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> On Thu, Apr 8, 2021 at 6:44 PM Fujii Masao 
> wrote:
> > The followings are the open items and discussion points that I'm
> thinking of.
> >
> > 1. Currently the extra information (TRUNCATE_REL_CONTEXT_NORMAL,
> TRUNCATE_REL_CONTEXT_ONLY or TRUNCATE_REL_CONTEXT_CASCADING) about how a
> foreign table was specified as the target to truncate in TRUNCATE command
> is collected and passed to FDW. Does this really need to be passed to FDW?
> Seems Stephen, Michael and I think that's necessary. But Kaigai-san does
> not. I also think that TRUNCATE_REL_CONTEXT_CASCADING can be removed
> because there seems no use case for that maybe.
>
> I think we should remove the unused enums/macros, instead we could
> mention a note of the extensibility of those enums/macros in the
> comments section around the enum/macro definitions.
>
> > 2. Currently when the same foreign table is specified multiple times in
> the command, the extra information only for the foreign table found first
> is collected. For example, when "TRUNCATE ft, ONLY ft" is executed,
> TRUNCATE_REL_CONTEXT_NORMAL is collected and _ONLY is ignored because "ft"
> is found first. Is this OK? Or we should collect all, e.g., both _NORMAL
> and _ONLY should be collected in that example? I think that the current
> approach (i.e., collect the extra info about table found first if the same
> table is specified multiple times) is good because even local tables are
> also treated the same way. But Kaigai-san does not.
>
> IMO, the foreign truncate command should be constructed by collecting
> all the information, i.e. "TRUNCATE ft, ONLY ft", and letting the remote
> server execute it however it wants. That will be consistent, and no
> extra logic is required to track the already-seen foreign tables while
> the foreign table collection/foreign truncate command is being prepared on
> the local server.
>
> I was thinking that Postgres throws an error or warning for commands
> such as TRUNCATE, VACUUM, ANALYZE when the same table is specified
> multiple times, but it seems that's not what it does.
>
> > 3. Currently postgres_fdw specifies ONLY clause in TRUNCATE command that
> it constructs. That is, if the foreign table is specified with ONLY,
> postgres_fdw also issues the TRUNCATE command for the corresponding remote
> table with ONLY to the remote server. Then only root table is truncated in
> remote server side, and the tables inheriting that are not truncated. Is
> this behavior desirable? Seems Michael and I think this behavior is OK. But
> Kaigai-san does not.
>
> I'm okay with the behaviour as it is consistent with what ONLY does to
> local tables. Documenting this behaviour(if not done already) is a
> better way I think.
>
> > 4. Tab-completion for TRUNCATE should be updated so that also foreign
> tables are displayed.
>
> It will be good to have.
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com


w.r.t. point #1:
bq. I think we should remove the unused enums/macros,

I agree. When there is a more concrete use case which requires a new enum, we
can add one whose meaning would be clearer.

Cheers


Re: Simplify backend terminate and wait logic in postgres_fdw test

2021-04-08 Thread Michael Paquier
On Fri, Apr 09, 2021 at 06:53:21AM +0530, Bharath Rupireddy wrote:
> I didn't think of the warning cases, my bad. How about using SET
> client_min_messages = 'ERROR'; before we call
> pg_wait_for_backend_termination? We can then depend only on the return
> value of pg_wait_for_backend_termination, and exit when it is true. This
> way the buildfarm will not see warnings. Thoughts?

You could do that, but I would also bet that this is going to get
forgotten in the future if this gets extended in more SQL tests that
are output-sensitive, in or out of core.  Honestly, I can get behind a
warning in pg_wait_for_backend_termination() to inform that the
process poked at is not a PostgreSQL one, because it offers new and
useful information to the user.  But, and my apologies for sounding a
bit noisy, I really don't get why pg_wait_until_termination() has any
need to do that.  From what I can see, it provides the following
information:
- A PID, that we already know from the caller or just from
pg_stat_activity.
- A timeout, already known as well.
- The fact that the process did not terminate, information given by
the "false" status, only used in this case.

So there is no new information here to the user, only a duplicate of
what's already known to the caller of this function.  I see more
advantages in removing this WARNING rather than keeping it.
--
Michael


signature.asc
Description: PGP signature


Re: Lots of incorrect comments in nodeFuncs.c

2021-04-08 Thread Michael Paquier
On Thu, Apr 08, 2021 at 09:21:30PM -0400, Alvaro Herrera wrote:
> On 2021-Apr-08, Tom Lane wrote:
>> Maybe like
>> 
>> case T_ScalarArrayOpExpr:
>> /* ScalarArrayOpExpr's result is boolean ... */
>> coll = InvalidOid;  /* ... so it has no collation */
>> break;
> 
> This is much clearer, yeah.

+1.
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] force_parallel_mode and GUC categories

2021-04-08 Thread Michael Paquier
On Sat, Apr 03, 2021 at 08:25:46PM -0500, Justin Pryzby wrote:
> Forking this thread
> https://www.postgresql.org/message-id/20210403154336.GG29125%40momjian.us

Didn't see this one, thanks for forking.

> I understood "developer" to mean someone who's debugging postgres itself, not
> (say) a function written using pl/pgsql.  Like backtrace_functions,
> post_auth_delay, jit_profiling_support.
> 
> But I see that some "dev" options are more user-facing (for a sufficiently
> advanced user):
> ignore_checksum_failure, ignore_invalid_pages, zero_damaged_pages.
> 
> Also, I understood this to mean the "category" in pg_settings, but I guess
> what's important here is the absense of the GUC in the sample/template config
> file.  pg_settings.category and the sample headings it appears are intended to
> be synchronized, but a few of them are out of sync.  See attached.
> 
> +1 to move this to "developer" options and remove it from the sample config:
> 
> # - Other Planner Options -
> #force_parallel_mode = off

0001 has some changes to pg_config_manual.h related to valgrind and
memory randomization.  You may want to remove that before posting a
patch.

-   {"track_commit_timestamp", PGC_POSTMASTER, REPLICATION,
+   {"track_commit_timestamp", PGC_POSTMASTER, REPLICATION_SENDING,
I can get behind this change for clarity where it gets actively used.

-   {"track_activity_query_size", PGC_POSTMASTER, RESOURCES_MEM,
+   {"track_activity_query_size", PGC_POSTMASTER, STATS_COLLECTOR,
But not this one, because it is a memory setting.

-   {"force_parallel_mode", PGC_USERSET, QUERY_TUNING_OTHER,
+   {"force_parallel_mode", PGC_USERSET, DEVELOPER_OPTIONS,
And not this one either, as it is mainly a planner thing, like the
other parameters in the same area.

The last change is related to log_autovacuum_min_duration, and I can
get behind the argument you are making to group all log activity
parameters together.  Now, about this part:
+#log_autovacuum_min_duration = -1  # -1 disables, 0 logs all actions and
+   # their durations, > 0 logs only
+   # actions running at least this number
+   # of milliseconds.
I think that we should clarify in the description that this is an
autovacuum-only thing, say by appending a small sentence about the
fact that it logs autovacuum activities, in a similar fashion to
log_temp_files.  Moving the parameter out of the autovacuum section
makes it lose a bit of context.

@@ -6903,6 +6903,7 @@ fetch_more_data_begin(AsyncRequest *areq)
 	char		sql[64];

Assert(!fsstate->conn_state->pendingAreq);
+   Assert(fsstate->conn);
What's this diff doing here? 
--
Michael


signature.asc
Description: PGP signature


Re: TRUNCATE on foreign table

2021-04-08 Thread Bharath Rupireddy
On Thu, Apr 8, 2021 at 6:44 PM Fujii Masao  wrote:
> The followings are the open items and discussion points that I'm thinking of.
>
> 1. Currently the extra information (TRUNCATE_REL_CONTEXT_NORMAL, 
> TRUNCATE_REL_CONTEXT_ONLY or TRUNCATE_REL_CONTEXT_CASCADING) about how a 
> foreign table was specified as the target to truncate in TRUNCATE command is 
> collected and passed to FDW. Does this really need to be passed to FDW? Seems 
> Stephen, Michael and I think that's necessary. But Kaigai-san does not. I 
> also think that TRUNCATE_REL_CONTEXT_CASCADING can be removed because there 
> seems no use case for that maybe.

I think we should remove the unused enums/macros, instead we could
mention a note of the extensibility of those enums/macros in the
comments section around the enum/macro definitions.

> 2. Currently when the same foreign table is specified multiple times in the 
> command, the extra information only for the foreign table found first is 
> collected. For example, when "TRUNCATE ft, ONLY ft" is executed, 
> TRUNCATE_REL_CONTEXT_NORMAL is collected and _ONLY is ignored because "ft" is 
> found first. Is this OK? Or we should collect all, e.g., both _NORMAL and 
> _ONLY should be collected in that example? I think that the current approach 
> (i.e., collect the extra info about table found first if the same table is 
> specified multiple times) is good because even local tables are also treated 
> the same way. But Kaigai-san does not.

IMO, the foreign truncate command should be constructed by collecting
all the information, i.e. "TRUNCATE ft, ONLY ft", and letting the remote
server execute it however it wants. That will be consistent, and no
extra logic is required to track the already-seen foreign tables while
the foreign table collection/foreign truncate command is being prepared on
the local server.

I was thinking that Postgres throws an error or warning for commands
such as TRUNCATE, VACUUM, ANALYZE when the same table is specified
multiple times, but it seems that's not what it does.

> 3. Currently postgres_fdw specifies ONLY clause in TRUNCATE command that it 
> constructs. That is, if the foreign table is specified with ONLY, 
> postgres_fdw also issues the TRUNCATE command for the corresponding remote 
> table with ONLY to the remote server. Then only root table is truncated in 
> remote server side, and the tables inheriting that are not truncated. Is this 
> behavior desirable? Seems Michael and I think this behavior is OK. But 
> Kaigai-san does not.

I'm okay with the behaviour as it is consistent with what ONLY does to
local tables. Documenting this behaviour (if not done already) is a
better way, I think.
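
To spell out the behaviour being agreed with here (ft is an imaginary
postgres_fdw foreign table whose remote table has inheritance children):

TRUNCATE ft;        -- postgres_fdw issues TRUNCATE on the remote table,
                    -- so its remote children are truncated too
TRUNCATE ONLY ft;   -- postgres_fdw issues TRUNCATE ONLY, leaving the
                    -- remote children alone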

> 4. Tab-completion for TRUNCATE should be updated so that also foreign tables 
> are displayed.

It will be good to have.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: [POC] Fast COPY FROM command for the table with foreign partitions

2021-04-08 Thread Zhihong Yu
On Thu, Apr 8, 2021 at 5:49 PM Justin Pryzby  wrote:

> I rebased this patch to resolve a trivial 1 line conflict from c5b7ba4e6.
>
> --
> Justin
>

Hi,
In src/backend/commands/copyfrom.c :

+   if (resultRelInfo->ri_RelationDesc->rd_rel->relkind ==
RELKIND_FOREIGN_TABLE)

There are a few steps of indirection. Adding an assertion on
resultRelInfo->ri_RelationDesc, etc. before the if statement would help catch a
potential invalid pointer.

+CopyToStart(CopyToState cstate)
...
+CopyToFinish(CopyToState cstate)

Since 'copy to' is the action, it would be easier to read the method names
if they're called StartCopyTo, FinishCopyTo, respectively.
That way, the method names would be consistent with existing ones, such as:
 extern uint64 DoCopyTo(CopyToState cstate);

+* If a partition's root parent isn't allowed to use it, neither is the

In the above sentence, 'it' refers to multi-insert. It would be more
readable to explicitly mention 'multi insert' instead of 'it'.

Cheers


RE: 2019-03 CF now in progress

2021-04-08 Thread tsunakawa.ta...@fujitsu.com
From: David Steele 
> Overall, 118 of 262 entries were closed during this commitfest (45%).
> That includes 91 patches committed since March 1, which is pretty
> fantastic. Thank you to everyone, especially the committers, for your
> hard work!

The number of committed patches in the last CF is record-breaking in recent 
years:

PG 14: 124
PG 13: 90
PG 12: 100
PG 11: 117
PG 10: 116
PG 9.6: 74
PG 9.5: 102

I take off my hat to the hard work of the committers and the CFM.  OTOH, there seem to
be many useful-looking features pending as Ready for Committer.  I look forward
to seeing them committed early in PG 15.


Regards
Takayuki Tsunakawa




Re: Simplify backend terminate and wait logic in postgres_fdw test

2021-04-08 Thread Bharath Rupireddy
On Fri, Apr 9, 2021 at 5:51 AM Michael Paquier  wrote:
>
> On Thu, Apr 08, 2021 at 06:27:56PM +0530, Bharath Rupireddy wrote:
> > Agree. Please see the attached patch, I removed a fixed waiting time.
> > Instead of relying on pg_stat_activity, pg_sleep and
> > pg_stat_clear_snapshot, now it depends on pg_terminate_backend and
> > pg_wait_for_backend_termination. This way we could reduce the
> > functions that the procedure terminate_backend_and_wait uses and also
> > the new functions pg_terminate_backend and
> > pg_wait_for_backend_termination gets a test coverage.
>
> +   EXIT WHEN is_terminated;
> +   SELECT * INTO is_terminated FROM 
> pg_wait_for_backend_termination(pid_v, 1000);
> This is still a regression if the termination takes more than 1s,
> no?  In such a case terminate_backend_and_wait() would issue a WARNING
> and pollute the regression test output.  I can see the point of what
> you are achieving here, and that's a good idea, but from the point of
> view of the buildfarm the WARNING introduced by aaf0432 is a no-go.

I didn't think of the warning cases, my bad. How about using SET
client_min_messages = 'ERROR'; before we call
pg_wait_for_backend_termination? We can then depend only on the return
value of pg_wait_for_backend_termination, and exit when it is true. This
way the buildfarm will not see warnings. Thoughts?
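
A minimal sketch of what I have in mind (the PID value is illustrative):

SET client_min_messages = 'ERROR';  -- keep any WARNING out of the expected output
SELECT pg_wait_for_backend_termination(12345, 1000);  -- rely only on true/false
RESET client_min_messages;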

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Lots of incorrect comments in nodeFuncs.c

2021-04-08 Thread Alvaro Herrera
On 2021-Apr-08, Tom Lane wrote:

> Maybe like
> 
> case T_ScalarArrayOpExpr:
> /* ScalarArrayOpExpr's result is boolean ... */
> coll = InvalidOid;  /* ... so it has no collation */
> break;

This is much clearer, yeah.

-- 
Álvaro Herrera   Valdivia, Chile




Re: 2019-03 CF now in progress

2021-04-08 Thread Michael Paquier
On Thu, Apr 08, 2021 at 09:49:20AM -0400, David Steele wrote:
> Overall, 118 of 262 entries were closed during this commitfest (45%). That
> includes 91 patches committed since March 1, which is pretty fantastic.
> Thank you to everyone, especially the committers, for your hard work!

I think that the biggest thanks here goes to you, for cleaning the CF
entries completely.  So, thanks!
--
Michael


signature.asc
Description: PGP signature


Re: psql - add SHOW_ALL_RESULTS option

2021-04-08 Thread Michael Paquier
On Fri, Apr 09, 2021 at 01:11:35AM +0800, Julien Rouhaud wrote:
> On Thu, Apr 08, 2021 at 07:04:01PM +0200, Fabien COELHO wrote:
>> It is definitely an open item. I'm not sure where you want to add it…
>> possibly the "Pg 14 Open Items" wiki page?
> 
> Correct.

I was running a long query this morning and wondered why the
cancellation was suddenly broken.  So I am not alone, and here you are
with already a solution :)

So, studying through 3a51306, this stuff has changed the query
execution from a sync PQexec() to an async PQsendQuery().  And the
proposed fix changes back to the behavior where the cancellation
reset happens after getting a result, as there is no need to cancel
anything.

No strong objections from here if the consensus is to make
SendQueryAndProcessResults() handle the cancel reset properly though I
am not sure if this is the cleanest way to do things, but let's make
at least the whole business consistent in the code for all those code
paths.  For example, PSQLexecWatch() does an extra ResetCancelConn()
that would be useless once we are done with
SendQueryAndProcessResults().  Also, I can see that
SendQueryAndProcessResults() would not issue a cancel reset if the
query fails, for \watch when cancel is pressed, and for \watch with
COPY.  So, my opinion here would be to keep ResetCancelConn() within
PSQLexecWatch(), just add an extra one in SendQuery() to make all the
three code paths printing results consistent, and leave
SendQueryAndProcessResults() out of the cancellation logic.

>> I tried but I do not have enough
>> privileges, if you can do it please proceed. I added an entry in the next CF
>> in the bugfix section.
> 
> That's strange, I don't think you need special permission there.  It's
> working for me so I added an item with a link to the patch!

As long as you have a community account, you should have the
possibility to edit the page.  So if you feel that any change is
required, please feel free to do so, of course.
--
Michael


signature.asc
Description: PGP signature


Re: [POC] Fast COPY FROM command for the table with foreign partitions

2021-04-08 Thread Justin Pryzby
I rebased this patch to resolve a trivial 1 line conflict from c5b7ba4e6.

-- 
Justin
>From 0987ca4f62fb8c9b43a3fe142d955d8a9cb6f36f Mon Sep 17 00:00:00 2001
From: Takayuki Tsunakawa 
Date: Tue, 9 Feb 2021 12:50:00 +0900
Subject: [PATCH] Fast COPY FROM into the foreign or sharded table.

This feature enables bulk COPY into a foreign table when multi-insert
is possible and the foreign table has a non-zero number of columns.

The following routines are added to the FDW interface:
* BeginForeignCopy
* ExecForeignCopy
* EndForeignCopy

BeginForeignCopy and EndForeignCopy initialize and free
the CopyState for bulk COPY. The ExecForeignCopy routine runs a
'COPY ... FROM STDIN' command on the foreign server, in an iterative
manner, sending tuples using the CopyTo() machinery.

Code that constructs a list of columns for a given foreign relation
in the deparseAnalyzeSql() routine is split into deparseRelColumnList().
It is reused in deparseCopyFromSql().

Added TAP-tests on the specific corner cases of COPY FROM STDIN operation.

By analogy with CopyFrom(), the CopyState structure was extended
with a data_dest_cb callback. It is used to send the text representation
of a tuple to a custom destination.
The PgFdwModifyState structure is extended with the cstate field.
It is needed to avoid repeated initialization of CopyState. Also for this
reason the CopyTo() routine is split into the set of routines CopyToStart()/
CopyTo()/CopyToFinish().

When 0d5f05cde introduced support for using multi-insert mode when
copying into partitioned tables, it introduced single variable of
enum type CopyInsertMethod shared across all potential target
relations (partitions) that, along with some target relation
properties, dictated whether to engage multi-insert mode for a given
target relation.

Change that decision logic to the combination of ExecMultiInsertAllowed()
and its caller. The former encapsulates the common criteria to allow
multi-insert. The latter uses additional criteria and sets the new
boolean field ri_usesMultiInsert of ResultRelInfo.
That prevents repeated computation of the same information in some cases,
especially for partitions, and the new arrangement results in slightly
more readability.
Enum CopyInsertMethod is removed.

Authors: Andrey Lepikhov, Ashutosh Bapat, Amit Langote, Takayuki Tsunakawa
Reviewed-by: Ashutosh Bapat, Amit Langote, Takayuki Tsunakawa
Discussion:
https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru
---
 contrib/postgres_fdw/deparse.c|  63 +++-
 .../postgres_fdw/expected/postgres_fdw.out|  46 ++-
 contrib/postgres_fdw/postgres_fdw.c   | 141 +
 contrib/postgres_fdw/postgres_fdw.h   |   1 +
 contrib/postgres_fdw/sql/postgres_fdw.sql |  45 +++
 doc/src/sgml/fdwhandler.sgml  |  71 -
 src/backend/commands/copy.c   |   2 +-
 src/backend/commands/copyfrom.c   | 271 --
 src/backend/commands/copyto.c |  88 --
 src/backend/executor/execMain.c   |  44 +++
 src/backend/executor/execPartition.c  |  37 ++-
 src/include/commands/copy.h   |   5 +
 src/include/commands/copyfrom_internal.h  |  10 -
 src/include/executor/executor.h   |   1 +
 src/include/foreign/fdwapi.h  |  15 +
 src/include/nodes/execnodes.h |   8 +-
 16 files changed, 637 insertions(+), 211 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index bdc4c3620d..bf93c1d091 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -185,6 +185,8 @@ static void appendAggOrderBy(List *orderList, List *targetList,
 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
 static Node *deparseSortGroupClause(Index ref, List *tlist, bool force_colno,
 	deparse_expr_cxt *context);
+static List *deparseRelColumnList(StringInfo buf, Relation rel,
+  bool enclose_in_parens);
 
 /*
  * Helper functions
@@ -1859,6 +1861,23 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
 		 withCheckOptionList, returningList, retrieved_attrs);
 }
 
+/*
+ * Deparse remote COPY FROM statement
+ *
+ * Note that this explicitly specifies the list of COPY's target columns
+ * to account for the fact that the remote table's columns may not match
+ * exactly with the columns declared in the local definition.
+ */
+void
+deparseCopyFromSql(StringInfo buf, Relation rel)
+{
+	appendStringInfoString(buf, "COPY ");
+	deparseRelation(buf, rel);
+	(void) deparseRelColumnList(buf, rel, true);
+
+	appendStringInfoString(buf, " FROM STDIN ");
+}
+
 /*
  * deparse remote UPDATE statement
  *
@@ -2120,6 +2139,30 @@ deparseAnalyzeSizeSql(StringInfo buf, Relation rel)
  */
 void
 deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs)
+{
+	appendStringInfoString(buf, "SELECT ");
+	*retrieved_attrs = deparseRelColumnList(buf, rel, 

Re: Remove page-read callback from XLogReaderState.

2021-04-08 Thread Kyotaro Horiguchi
At Thu, 8 Apr 2021 23:51:34 +1200, Thomas Munro  wrote 
in 
> On Thu, Apr 8, 2021 at 9:46 PM Thomas Munro  wrote:
> > I squashed the patch set into one because half of them were fixups,
> > and the two main patches were really parts of the same change and
> > should go in together.
> >
> > I fixed a few compiler warnings (GCC 10.2 reported several
> > uninitialised variables, comparisons that are always true, etc) and
> > some comments.  You can see these in the fixup patch.
> 
> Pushed.  Luckily there are plenty more improvements possible for
> XLogReader/XLogDecoder in the next cycle.

I'm surprised to see this pushed this soon.  Thanks for pushing this!

And thanks for fixing the remaining mistakes including some stupid
ones..

At Thu, 8 Apr 2021 10:04:26 +1200, Thomas Munro  wrote 
in 
> There is a stray elog(HOGE) :-)

Ugh!  This looks like something that slipped in while investigating
another issue.  Thanks for preventing the repository from being
contaminated by such a thing.

At Thu, 8 Apr 2021 21:46:06 +1200, Thomas Munro  wrote 
in 
> I think maybe it should really be XLogReaderSetInputData(state,
> tli, data, size) in a later release.  In the meantime, I changed it to
> XLogReaderSetInputData(state, size), hope that name makes sense...

Sounds better. I didn't like that page-readers are allowed to touch
XLogReaderState.seg directly. Anyway it would be a small change.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: missing documentation for streaming in-progress transactions

2021-04-08 Thread Euler Taveira
On Thu, Apr 8, 2021, at 4:25 AM, Ajin Cherian wrote:
> Updated.

-   Protocol version. Currently only version 1 is
-   supported.
-  
+   Protocol version. Currently versions 1 and
+   2 are supported. The version 2
+   is supported only for server versions 14 and above, and is used to allow
+   streaming of large in-progress transactions.
+ 

s/server versions/server version/

I suggest that the last part of the sentence might be "and it allows streaming
of large in-progress transactions"

+  Since: 2
+
+

I didn't like this style because it is not descriptive enough. It is also not a
style adopted by Postgres. I suggest adding something like "This field was
introduced in version 2" or "This field is available since version 2" after the
field description.

+Xid of the sub-transaction (will be same as xid of the 
transaction for top-level
+transactions).
+

Although sub-transaction is also used in the documentation, I suggest using
subtransaction. Maybe change the other sub-transaction occurrences too.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


RE: [PATCH] Tracking statements entry timestamp in pg_stat_statements

2021-04-08 Thread kuroda.hay...@fujitsu.com
Dear Andrei,

> I think, yes, the version of pg_stat_statements needs to be updated. Will
> it be 1.10 in PG15?

I think you are right.
According to [1] we can bump up the version once per PG major version,
and no features have been committed yet for 15.

[1]: https://www.postgresql.org/message-id/20201202040516.GA43757@nol


Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: Simplify backend terminate and wait logic in postgres_fdw test

2021-04-08 Thread Michael Paquier
On Thu, Apr 08, 2021 at 06:27:56PM +0530, Bharath Rupireddy wrote:
> Agree. Please see the attached patch, I removed a fixed waiting time.
> Instead of relying on pg_stat_activity, pg_sleep and
> pg_stat_clear_snapshot, now it depends on pg_terminate_backend and
> pg_wait_for_backend_termination. This way we could reduce the
> functions that the procedure terminate_backend_and_wait uses, and the
> new functions pg_terminate_backend and
> pg_wait_for_backend_termination also get test coverage.

+   EXIT WHEN is_terminated;
+   SELECT * INTO is_terminated FROM pg_wait_for_backend_termination(pid_v, 
1000);
This is still a regression if the termination takes more than 1s,
no?  In such a case terminate_backend_and_wait() would issue a WARNING
and pollute the regression test output.  I can see the point of what
you are achieving here, and that's a good idea, but from the point of
view of the buildfarm the WARNING introduced by aaf0432 is a no-go.  I
honestly don't quite get the benefit of issuing a WARNING in this case
anyway, as the code already returns false on timeout, so the caller
would know the status of the operation.
--
Michael




Re: Lots of incorrect comments in nodeFuncs.c

2021-04-08 Thread Tom Lane
David Rowley  writes:
> hmm ok.  I imagine there must be a better way to say that then since
> it confused at least 1 reader so far.  My problem is that I assumed
> "result" meant the result of the function that the comment is written
> in, not the result of evaluating the given expression during
> execution. If that was more clear then I'd not have been misled.

Maybe like

case T_ScalarArrayOpExpr:
/* ScalarArrayOpExpr's result is boolean ... */
coll = InvalidOid;  /* ... so it has no collation */
break;

?

regards, tom lane




Re: SQL:2011 PERIODS vs Postgres Ranges?

2021-04-08 Thread Paul A Jungwirth
On Thu, Apr 8, 2021 at 7:22 AM David Steele  wrote:
>
> Paul, you can submit to the next CF when you are ready with a new patch.

Thanks David! I've made a lot of progress but still need to finish
support for CASCADE on temporal foreign keys. I've been swamped with
other things, but hopefully I can get something during this current
CF.

Paul




Re: test runner (was Re: SQL-standard function body)

2021-04-08 Thread Michael Paquier
On Thu, Apr 08, 2021 at 10:50:39AM -0700, Andres Freund wrote:
> Obviously all very far from being ready, but this seemed like a good
> enough excuse to mention it ;)

This is nice.  Are there any parallelism capabilities?
--
Michael




Re: Lots of incorrect comments in nodeFuncs.c

2021-04-08 Thread David Rowley
On Fri, 9 Apr 2021 at 10:11, Tom Lane  wrote:
>
> David Rowley  writes:
> > I noticed that nodeFuncs.c appears to have some pretty sloppy work
> > done in many of the comments.  Many look like they've just not been
> > updated from a copy/paste/edit from another node function.
> > The attached aims to clean these up.
>
> I believe every one of these changes is wrong.
> For instance:
>
> case T_ScalarArrayOpExpr:
> -   coll = InvalidOid;  /* result is always boolean */
> +   coll = InvalidOid;  /* result is always InvalidOid */
> break;
>
> The point here is that the result type of ScalarArrayOpExpr is boolean,
> which has no collation, therefore reporting its collation as InvalidOid
> is correct.  Maybe there's a clearer way to say that, but your text is
> more confusing not less so.

hmm ok.  I imagine there must be a better way to say that then since
it confused at least 1 reader so far.  My problem is that I assumed
"result" meant the result of the function that the comment is written
in, not the result of evaluating the given expression during
execution. If that was more clear then I'd not have been misled.

David




Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

2021-04-08 Thread David Rowley
On Fri, 9 Apr 2021 at 09:32, Tomas Vondra  wrote:
>
> I ran the same set of benchmarks on the v6, which I think should be
> mostly identical to what was committed. I extended the test a bit to
> test table with 0, 1, 5 and 1000 rows, and also with int and text
> values, to see how it works with more expensive comparators.
>
> I built binaries with gcc 9.2.0 and clang 11.0.0, the full results are
> attached. There's a bit of difference between gcc and clang, but the
> general behavior is about the same, so I'll only present gcc results to
> keep this simple. I'll only show the throughput comparison to master, so >1.0
> means good, <1.0 means bad. If you're interested in actual tps, see the
> full results.
>
> For the v5 patch (actually v4-0001 + v5-0002) and v6, the results are:
>
> integer column / v5
> ===
>
> rows  10/p100/p 16/p 10/s100/s 16/s
> ---
>0   97%  97%  97% 107% 126% 108%
>1   95%  82%  94% 108% 132% 110%
>5   95%  83%  95% 108% 132% 110%
> 1000  129% 481% 171% 131% 382% 165%

I think we should likely ignore the v5 patch now.  The reason is that
it was pretty much unfinished and there were many places where I'd not
yet added support for the HashedScalarArrayOpExpr node type.  This
could cause these nodes to be skipped during node mutations or node
walking which would certainly make planning faster, just not in a way
that's correct.

> integer column / v6
> ===
>
> rows  10/p100/p 16/p 10/s100/s 16/s
> ---
>0   97%  97%  97%  98%  98%  98%
>1   96%  84%  95%  97%  97%  98%
>5   97%  85%  96%  98%  97%  97%
> 1000  129% 489% 172% 128% 330% 162%

This is a really informative set of results. I can only guess that the
slowdown of the 100/prepared query is down to building the hash table.
I think that's the case because the 0 rows test does not show the slowdown and we
only build the table when evaluating for the first time. There's a
slightly larger hit on 1 row vs 5 rows, which makes sense since the
rewards of the hash lookup start paying off more with more rows.

Looking at your tps numbers, I think I can see why we get the drop in
performance with prepared statements but not simple statements. This
seems to just be down to the fact that the planning time dominates in
the simple statement case.  For example, the "1 row" test for 100/s
for v6 is 10023.3 tps, whereas the 100/p result is 44093.8 tps. With
master, prepared gets 52400.0 tps. So we could say the hash table
build costs us 8306.2 tps, or 3.59 microseconds per execution, per:

postgres=# select 1000000 / 52400.0 - 1000000 / 44093.8;
  ?column?
-
 -3.5949559161508538
(1 row)

If we look at the tps for the simple query version of the same test:
Master did 10309.6 tps, v6 did 10023.3 tps. If we apply that 3.59
microsecond slowdown to master's tps, then we get pretty close to
within 1% of the v6 tps:

postgres=# select 1000000 / (1000000 / 10309.6 + 3.59);
   ?column?
---
 9941.6451581291294165
(1 row)

> text column / v6
> 
>
> rows  10/p100/p 16/p 10/s100/s 16/s
> ---
>0  101% 101% 101%  98%  99%  99%
>1   98%  82%  96%  98%  96%  97%
>5  100%  84%  98%  98%  96%  98%
> 1000  297%1645% 408% 255%1000% 336%
>
>
> Overall, the behavior for integer and text columns is the same, for both
> patches. There are a couple of interesting observations:
>
> 1) For the "simple" query mode, v5 helped quite a bit (20-30% speedup),
> but v6 does not seem to help at all - it's either same or slower than
> unpatched master.

I think that's related to the fact that I didn't finish adding
HashedScalarArrayOpExpr processing to all places that needed it.

> I wonder why is that, and if we could get some of the speedup with v6?
> At first I thought that maybe v5 is not building the hash table in cases
> where v6 does, but that shouldn't be faster than master.

I don't think v5 and v6 really do anything much differently in the
executor. The only difference is really during ExecInitExprRec() when
we initialize the expression. With v5 we had a case
T_HashedScalarArrayOpExpr: to handle the new node type, but in v6 we
have if (OidIsValid(opexpr->hashfuncid)). Oh, wait. I did add the
missing permissions check on the hash function, so that will account
for something. As far as I can see, that's required.

> 2) For the "prepared" mode, there's a clear performance hit the 

Re: Typo in jsonfuncs.c

2021-04-08 Thread Tatsuro Yamada

Hi Julien and Amit Kapila,

On 2021/04/08 17:33, Julien Rouhaud wrote:

On Thu, Apr 08, 2021 at 10:06:56AM +0900, Tatsuro Yamada wrote:

Hi,

I found a typo in jsonfuncs.c, probably.
   s/an an/an/
Please find attached patch.


For the archives' sake, this has been pushed as of 8ffb003591.



Julien, thanks for the info! :-D
Also, thanks for taking the time to push this, Amit.
 
Regards,

Tatsuro Yamada





Re: weird interaction between asynchronous queries and pg_sleep

2021-04-08 Thread Merlin Moncure
On Thu, Apr 8, 2021 at 1:05 PM Merlin Moncure  wrote:
> This effect is only noticeable when the remote query is returning
> volumes of data.  My question is, is there any way to run a sleep loop
> client side without incurring a 3x performance penalty?  And why is it
> that when more local sleep queries are executed, performance improves?


Looking at this more, it looks like when sleeping with pg_sleep,
libpq does not receive the data.  I think for this type of pattern to
work correctly, dblink would need a custom sleep function wrapping
poll (or epoll) that consumes input on the socket when signalled
read-ready.
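
A minimal sketch of what such a sleep could look like on the client side,
using poll() and PQconsumeInput() so that libpq keeps draining the socket
while we wait (the function name, the fixed 100ms granularity, and the
error handling are made up for illustration, not part of dblink):

#include <poll.h>
#include <libpq-fe.h>

static int
sleep_and_consume(PGconn *conn, int msec)
{
	struct pollfd pfd;

	pfd.fd = PQsocket(conn);
	pfd.events = POLLIN;

	while (msec > 0)
	{
		int		rc = poll(&pfd, 1, msec < 100 ? msec : 100);

		if (rc > 0 && (pfd.revents & POLLIN))
		{
			/* data is ready: pull it into libpq's buffer right away */
			if (!PQconsumeInput(conn))
				return -1;		/* connection trouble */
		}
		msec -= 100;			/* coarse accounting is fine for a sketch */
	}
	return 0;
}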

merlin




Re: SQL-standard function body

2021-04-08 Thread Michael Paquier
On Thu, Apr 08, 2021 at 12:21:05PM -0400, Tom Lane wrote:
> I see that contrib/test_decoding also sets NO_INSTALLCHECK = 1,
> and the reason it gets tested is that the buildfarm script has
> a special module for that.  I guess we need to clone that module,
> or maybe better, find a way to generalize it.
> 
> There are also some src/test/modules modules that set NO_INSTALLCHECK,
> but apparently those do have coverage (modules-check is the step that
> runs their SQL tests, and then the TAP tests if any get broken out
> as separate buildfarm steps).

FWIW, on Windows any module with NO_INSTALLCHECK does not get tested
as we rely mostly on an installed server to do all the tests and avoid
the performance impact of setting up a new server for each module's
test.
--
Michael




Re: pg_amcheck contrib application

2021-04-08 Thread Robert Haas
On Thu, Apr 8, 2021 at 6:51 PM Mark Dilger  wrote:
> #2 is if chunk_seq goes up but skips numbers.  #3 is if chunk_seq ever goes 
> down, meaning the index scan did something unexpected.

Yeah, sure. But I think we could probably treat those the same way.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg_amcheck contrib application

2021-04-08 Thread Mark Dilger



> On Apr 8, 2021, at 3:11 PM, Robert Haas  wrote:
> 
> On Thu, Apr 8, 2021 at 5:21 PM Mark Dilger  
> wrote:
>> All this leads me to believe that we should report the following:
>> 
>> 1) If the total number of chunks retrieved differs from the expected number, 
>> report how many we expected vs. how many we got
>> 2) If the chunk_seq numbers are discontiguous, report each discontiguity.
>> 3) If the index scan returned chunks out of chunk_seq order, report that
>> 4) If any chunk is not the expected size, report that
>> 
>> So, for your example of chunk 1 missing from chunks [0..17], we'd report 
>> that we got one fewer chunks than we expected, that the second chunk seen 
>> was discontiguous from the first chunk seen, that the final chunk seen was 
>> smaller than expected by M bytes, and that the total size was smaller than 
>> we expected by N bytes.  The third of those is somewhat misleading, since 
>> the final chunk was presumably the right size; we just weren't expecting to 
>> hit a partial chunk quite yet.  But I don't see how to make that better in 
>> the general case.
> 
> Hmm, that might be OK. It seems like it's going to be a bit verbose in
> simple cases like 1 missing chunk, but on the plus side, it avoids a
> mountain of output if the raw size has been overwritten with a
> gigantic bogus value. But, how is #2 different from #3? Those sound
> like the same thing to me.

#2 is if chunk_seq goes up but skips numbers.  #3 is if chunk_seq ever goes 
down, meaning the index scan did something unexpected.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Lots of incorrect comments in nodeFuncs.c

2021-04-08 Thread Tom Lane
David Rowley  writes:
> I noticed that nodeFuncs.c appears to have some pretty sloppy work
> done in many of the comments.  Many look like they've just not been
> updated from a copy/paste/edit from another node function.
> The attached aims to clean these up.

I believe every one of these changes is wrong.
For instance:

case T_ScalarArrayOpExpr:
-   coll = InvalidOid;  /* result is always boolean */
+   coll = InvalidOid;  /* result is always InvalidOid */
break;

The point here is that the result type of ScalarArrayOpExpr is boolean,
which has no collation, therefore reporting its collation as InvalidOid
is correct.  Maybe there's a clearer way to say that, but your text is
more confusing not less so.

Likewise, the point of the annotations in exprSetCollation is to not
let a collation be applied to a node that must have a noncollatable
result type.

regards, tom lane




Re: pg_amcheck contrib application

2021-04-08 Thread Robert Haas
On Thu, Apr 8, 2021 at 5:21 PM Mark Dilger  wrote:
> All this leads me to believe that we should report the following:
>
> 1) If the total number of chunks retrieved differs from the expected number, 
> report how many we expected vs. how many we got
> 2) If the chunk_seq numbers are discontiguous, report each discontiguity.
> 3) If the index scan returned chunks out of chunk_seq order, report that
> 4) If any chunk is not the expected size, report that
>
> So, for your example of chunk 1 missing from chunks [0..17], we'd report that 
> we got one fewer chunks than we expected, that the second chunk seen was 
> discontiguous from the first chunk seen, that the final chunk seen was 
> smaller than expected by M bytes, and that the total size was smaller than we 
> expected by N bytes.  The third of those is somewhat misleading, since the 
> final chunk was presumably the right size; we just weren't expecting to hit a 
> partial chunk quite yet.  But I don't see how to make that better in the 
> general case.

Hmm, that might be OK. It seems like it's going to be a bit verbose in
simple cases like 1 missing chunk, but on the plus side, it avoids a
mountain of output if the raw size has been overwritten with a
gigantic bogus value. But, how is #2 different from #3? Those sound
like the same thing to me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Autovacuum on partitioned table (autoanalyze)

2021-04-08 Thread Tom Lane
Justin Pryzby  writes:
> On Thu, Apr 08, 2021 at 05:56:25PM -0400, Alvaro Herrera wrote:
>> This new bit reads weird:
>> 
>> +Most parameters are not supported on partitioned tables, with exceptions
>> +noted below; you may specify them for individual leaf partitions.

> "Except where noted, these parameters are not supported on partitioned
> tables."

I think what it's trying to get at is

"Except where noted, these parameters are not supported on partitioned
tables.  However, you can specify them on individual leaf partitions."

regards, tom lane




Lots of incorrect comments in nodeFuncs.c

2021-04-08 Thread David Rowley
I noticed that nodeFuncs.c appears to have some pretty sloppy work
done in many of the comments.  Many look like they've just not been
updated from a copy/paste/edit from another node function.

The attached aims to clean these up.

I plan to push this later today unless anyone has anything they'd
like to say about it first.

David


fix_numerous_copy_paste_errors_in_nodefuncs_comments.patch
Description: Binary data


Re: Autovacuum on partitioned table (autoanalyze)

2021-04-08 Thread Justin Pryzby
On Thu, Apr 08, 2021 at 05:56:25PM -0400, Alvaro Herrera wrote:
> On 2021-Apr-08, Justin Pryzby wrote:
> 
> > commit 0827e8af70f4653ba17ed773f123a60eadd9f9c9
> > |This also introduces necessary reloptions support for partitioned 
> > tables
> > |(autovacuum_enabled, autovacuum_analyze_scale_factor,
> > |autovacuum_analyze_threshold).  It's unclear how best to document this
> > |aspect.
> > 
> > At least this part needs to be updated - see also ed62d3737.
> > 
> > doc/src/sgml/ref/create_table.sgml-The storage parameters currently
> > doc/src/sgml/ref/create_table.sgml-available for tables are listed 
> > below.
> > ...
> > doc/src/sgml/ref/create_table.sgml:Specifying these parameters for 
> > partitioned tables is not supported,
> > doc/src/sgml/ref/create_table.sgml-but you may specify them for 
> > individual leaf partitions.
> 
> Ah, thanks for pointing it out.  How about the attached?
> 
> This new bit reads weird:
> 
> +Most parameters are not supported on partitioned tables, with exceptions
> +noted below; you may specify them for individual leaf partitions.

"Except where noted, these parameters are not supported on partitioned tables."

-- 
Justin




Re: Autovacuum on partitioned table (autoanalyze)

2021-04-08 Thread Alvaro Herrera
On 2021-Apr-08, Justin Pryzby wrote:

> commit 0827e8af70f4653ba17ed773f123a60eadd9f9c9
> |This also introduces necessary reloptions support for partitioned tables
> |(autovacuum_enabled, autovacuum_analyze_scale_factor,
> |autovacuum_analyze_threshold).  It's unclear how best to document this
> |aspect.
> 
> At least this part needs to be updated - see also ed62d3737.
> 
> doc/src/sgml/ref/create_table.sgml-The storage parameters currently
> doc/src/sgml/ref/create_table.sgml-available for tables are listed below.
> ...
> doc/src/sgml/ref/create_table.sgml:Specifying these parameters for 
> partitioned tables is not supported,
> doc/src/sgml/ref/create_table.sgml-but you may specify them for 
> individual leaf partitions.

Ah, thanks for pointing it out.  How about the attached?

This new bit reads weird:

+Most parameters are not supported on partitioned tables, with exceptions
+noted below; you may specify them for individual leaf partitions.


Maybe "Most parameters are not supported on partitioned tables, with
exceptions noted below; you may specify others for individual leaf
partitions."

-- 
Álvaro Herrera39°49'30"S 73°17'W
>From 37a829ec7b9c46acbbdb02f231288e39d22fcd04 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 8 Apr 2021 17:53:22 -0400
Subject: [PATCH] document reloptions for partitioned tables

---
 doc/src/sgml/ref/create_table.sgml | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 44e50620fd..3cf355cc8d 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1369,8 +1369,8 @@ WITH ( MODULUS numeric_literal, REM
 If a table parameter value is set and the
 equivalent toast. parameter is not, the TOAST table
 will use the table's parameter value.
-Specifying these parameters for partitioned tables is not supported,
-but you may specify them for individual leaf partitions.
+Most parameters are not supported on partitioned tables, with exceptions
+noted below; you may specify them for individual leaf partitions.

 

@@ -1452,6 +1452,7 @@ WITH ( MODULUS numeric_literal, REM
  If true, the autovacuum daemon will perform automatic VACUUM
  and/or ANALYZE operations on this table following the rules
  discussed in .
+ This parameter is supported on partitioned tables.
  If false, this table will not be autovacuumed, except to prevent
  transaction ID wraparound. See  for
  more about wraparound prevention.
@@ -1576,6 +1577,7 @@ WITH ( MODULUS numeric_literal, REM
  
   Per-table value for 
   parameter.
+ This parameter is supported on partitioned tables.
  
 

@@ -1591,6 +1593,7 @@ WITH ( MODULUS numeric_literal, REM
  
   Per-table value for 
   parameter.
+ This parameter is supported on partitioned tables.
  
 

-- 
2.20.1



Re: [PATCH] force_parallel_mode and GUC categories

2021-04-08 Thread Justin Pryzby
The previous patches accidentally included some unrelated changes.

-- 
Justin
>From fd67dd04d1b824a25e113796c235fd9fc9db23e0 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 3 Apr 2021 19:06:37 -0500
Subject: [PATCH v2 1/4] track_activity_query_size is STATS_COLLECTOR category

Not Resource Usage / Memory, as since 995fb7420
---
 src/backend/utils/misc/guc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 090abdad8b..e54209995d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3506,7 +3506,7 @@ static struct config_int ConfigureNamesInt[] =
 	},
 
 	{
-		{"track_activity_query_size", PGC_POSTMASTER, RESOURCES_MEM,
+		{"track_activity_query_size", PGC_POSTMASTER, STATS_COLLECTOR,
 			gettext_noop("Sets the size reserved for pg_stat_activity.query, in bytes."),
 			NULL,
 			GUC_UNIT_BYTE
-- 
2.17.0

>From bbed9ed2c3c55b0ccd51358a5c62baa07d1a5ee1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 3 Apr 2021 19:10:01 -0500
Subject: [PATCH v2 2/4] log_autovacuum_min_duration is LOGGING_WHAT

Not AUTOVACUUM, since 48f7e6439 and ef23a7744
---
 doc/src/sgml/config.sgml  | 56 +--
 src/backend/utils/misc/postgresql.conf.sample |  8 +--
 2 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 26628f3e6d..eb154cd669 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6812,6 +6812,34 @@ local0.*/var/log/postgresql
   
  
 
+ 
+  log_autovacuum_min_duration (integer)
+  
+   log_autovacuum_min_duration
+   configuration parameter
+  
+  
+  
+   
+Causes each action executed by autovacuum to be logged if it ran for at
+least the specified amount of time.  Setting this to zero logs
+all autovacuum actions. -1 (the default) disables
+logging autovacuum actions.
+If this value is specified without units, it is taken as milliseconds.
+For example, if you set this to
+250ms then all automatic vacuums and analyzes that run
+250ms or longer will be logged.  In addition, when this parameter is
+set to any value other than -1, a message will be
+logged if an autovacuum action is skipped due to a conflicting lock or a
+concurrently dropped relation.  Enabling this parameter can be helpful
+in tracking autovacuum activity.  This parameter can only be set in
+the postgresql.conf file or on the server command line;
+but the setting can be overridden for individual tables by
+changing table storage parameters.
+   
+  
+ 
+
  
   log_checkpoints (boolean)
   
@@ -7827,34 +7855,6 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
   
  
 
- 
-  log_autovacuum_min_duration (integer)
-  
-   log_autovacuum_min_duration
-   configuration parameter
-  
-  
-  
-   
-Causes each action executed by autovacuum to be logged if it ran for at
-least the specified amount of time.  Setting this to zero logs
-all autovacuum actions. -1 (the default) disables
-logging autovacuum actions.
-If this value is specified without units, it is taken as milliseconds.
-For example, if you set this to
-250ms then all automatic vacuums and analyzes that run
-250ms or longer will be logged.  In addition, when this parameter is
-set to any value other than -1, a message will be
-logged if an autovacuum action is skipped due to a conflicting lock or a
-concurrently dropped relation.  Enabling this parameter can be helpful
-in tracking autovacuum activity.  This parameter can only be set in
-the postgresql.conf file or on the server command line;
-but the setting can be overridden for individual tables by
-changing table storage parameters.
-   
-  
- 
-
  
   autovacuum_max_workers (integer)
   
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9830cfe382..cc9edc410f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -575,6 +575,10 @@
 #log_temp_files = -1			# log temporary files equal or larger
 	# than the specified size in kilobytes;
 	# -1 disables, 0 logs all temp files
+#log_autovacuum_min_duration = -1	# -1 disables, 0 logs all actions and
+	# their durations, > 0 logs only
+	# actions running at least this number
+	# of milliseconds.
 #log_timezone = 'GMT'
 
 #--
@@ -616,10 +620,6 @@
 
 #autovacuum = on			# Enable autovacuum subprocess?  'on'
 	# requires track_counts to also be on.

Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

2021-04-08 Thread Tomas Vondra
On 4/8/21 2:00 PM, David Rowley wrote:
> On Thu, 8 Apr 2021 at 22:54, David Rowley  wrote:
>>
>> I think the changes in the patch are fairly isolated and the test
>> coverage is now pretty good.  I'm planning on looking at the patch
>> again now and will consider pushing it for PG14.
> 
> I push this with some minor cleanup from the v6 patch I posted earlier.
> 

I ran the same set of benchmarks on the v6, which I think should be
mostly identical to what was committed. I extended the test a bit to
test table with 0, 1, 5 and 1000 rows, and also with int and text
values, to see how it works with more expensive comparators.

I built binaries with gcc 9.2.0 and clang 11.0.0, the full results are
attached. There's a bit of difference between gcc and clang, but the
general behavior is about the same, so I'll only present gcc results to
keep this simple. I'll only show the throughput comparison to master, so >1.0
means good, <1.0 means bad. If you're interested in actual tps, see the
full results.

For the v5 patch (actually v4-0001 + v5-0002) and v6, the results are:

integer column / v5
===

rows  10/p100/p 16/p 10/s100/s 16/s
---
   0   97%  97%  97% 107% 126% 108%
   1   95%  82%  94% 108% 132% 110%
   5   95%  83%  95% 108% 132% 110%
1000  129% 481% 171% 131% 382% 165%


integer column / v6
===

rows  10/p100/p 16/p 10/s100/s 16/s
---
   0   97%  97%  97%  98%  98%  98%
   1   96%  84%  95%  97%  97%  98%
   5   97%  85%  96%  98%  97%  97%
1000  129% 489% 172% 128% 330% 162%


text column / v5


rows  10/p100/p 16/p 10/s100/s 16/s
---
   0  100% 100% 100% 106% 119% 108%
   1   96%  81%  95% 107% 120% 109%
   5   97%  82%  96% 107% 121% 109%
1000  291%1622% 402% 255%1092% 337%


text column / v6


rows  10/p100/p 16/p 10/s100/s 16/s
---
   0  101% 101% 101%  98%  99%  99%
   1   98%  82%  96%  98%  96%  97%
   5  100%  84%  98%  98%  96%  98%
1000  297%1645% 408% 255%1000% 336%


Overall, the behavior for integer and text columns is the same, for both
patches. There are a couple of interesting observations:

1) For the "simple" query mode, v5 helped quite a bit (20-30% speedup),
but v6 does not seem to help at all - it's either same or slower than
unpatched master.

I wonder why is that, and if we could get some of the speedup with v6?
At first I thought that maybe v5 is not building the hash table in cases
where v6 does, but that shouldn't be faster than master.


2) For the "prepared" mode, there's a clear performance hit the longer
the array is (for both v5 and v6). For 100 elements it's about 15%,
which is not great.

I think the reason is fairly simple - building the hash table is not
free, and with few rows it's not worth it - it'd be faster to just
search the array directly. Unfortunately, the logic that makes the
decision to switch to hashing looks only at the array length, and
ignores the number of rows entirely. So I think if we want to address
this, convert_saop_to_hashed_saop needs to compare

hash_build_cost + nrows * hash_lookup_cost

and

nrows * linear_lookup_cost

to make reasonable decision.
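
Just to illustrate the shape of such a check (the cost constants and the
est_rows parameter here are entirely made up, and I'm not sure the walker
currently has a row estimate available in this form):

/*
 * Purely illustrative: hypothetical cost constants, and est_rows would
 * have to come from the plan node's row estimate, which the walker does
 * not currently carry around.
 */
static bool
hashed_saop_is_worthwhile(int array_len, double est_rows)
{
	double		hash_build_cost = 10.0 * array_len;	/* one insert per element */
	double		hash_lookup_cost = 1.0;
	double		linear_lookup_cost = 0.5 * array_len;	/* expect to scan half */

	return hash_build_cost + est_rows * hash_lookup_cost <
		est_rows * linear_lookup_cost;
}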

I was thinking that maybe we can ignore this, because people probably
have much larger tables in practice. But I'm not sure that's really
true, because there may be other quals and it's possible the preceding
ones are quite selective, filtering most of the rows.

I'm not sure how much of the necessary information we have available in
convert_saop_to_hashed_saop_walker, though :-( I suppose we know the
number of input rows for that plan node, not sure about selectivity of
the other quals, though.

It's also a bit strange that we get speedup for "simple" protocol, while
for "prepared" it gets slower. That seems counter-intuitive, because why
should we see opposite outcomes in those cases? I'd assume that we'll
see either speedup or slowdown in both cases, with the relative change
being more significant in the "prepared" mode.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


saop gcc.ods
Description: application/vnd.oasis.opendocument.spreadsheet


saop llvm.ods
Description: 

Re: Autovacuum on partitioned table (autoanalyze)

2021-04-08 Thread Justin Pryzby
On Thu, Apr 08, 2021 at 01:20:14AM -0400, Alvaro Herrera wrote:
> On 2021-Apr-07, Alvaro Herrera wrote:
> 
> > OK, I bit the bullet and re-did the logic in the way I had proposed
> > earlier in the thread: do the propagation on the collector's side, by
> > sending only the list of ancestors: the collector can read the tuple
> > change count by itself, to add it to each ancestor.  This seems less
> > wasteful.  Attached is v16 which does it that way and seems to work
> > nicely under my testing.
> 
> Pushed with this approach.  Thanks for persisting with this.

commit 0827e8af70f4653ba17ed773f123a60eadd9f9c9
|This also introduces necessary reloptions support for partitioned tables
|(autovacuum_enabled, autovacuum_analyze_scale_factor,
|autovacuum_analyze_threshold).  It's unclear how best to document this
|aspect.

At least this part needs to be updated - see also ed62d3737.

doc/src/sgml/ref/create_table.sgml-The storage parameters currently
doc/src/sgml/ref/create_table.sgml-available for tables are listed below.
...
doc/src/sgml/ref/create_table.sgml:Specifying these parameters for 
partitioned tables is not supported,
doc/src/sgml/ref/create_table.sgml-but you may specify them for individual 
leaf partitions.

-- 
Justin




Re: pgsql: autovacuum: handle analyze for partitioned tables

2021-04-08 Thread Alvaro Herrera
On 2021-Apr-08, Zhihong Yu wrote:

> Hi,
> Within truncate_update_partedrel_stats(), dirty is declared within the loop.
> +   if (rd_rel->reltuples != 0)
> +   {
> ...
> +   if (dirty)
> 
> The two if blocks can be merged. The variable dirty can be dropped.

Hi, thanks for reviewing.  Yes, evidently I copied vac_update_relstats
too closely -- that boolean is not necessary.

-- 
Álvaro Herrera   Valdivia, Chile




Re: pg_amcheck contrib application

2021-04-08 Thread Mark Dilger



> On Apr 8, 2021, at 1:05 PM, Robert Haas  wrote:
> 
> On Thu, Apr 8, 2021 at 3:02 PM Mark Dilger  
> wrote:
>> Imagine a toasted attribute with 18 chunks numbered [0..17].  Then we update 
>> the toast to have only 6 chunks numbered [0..5] except we corruptly keep 
>> chunks numbered [12..17] in the toast table.  We'd rather see a report like 
>> this:
> [ toast value NNN chunk NNN has sequence number NNN, but expected
> sequence number NNN ]
>> than one like this:
> [ toast value NNN contains chunk NNN where chunk NNN was expected ]
>> 
>> because saying the toast value ended at "chunk 12" after saying that it 
>> contains "chunk 17" is contradictory.  You need the distinction between the 
>> chunk number and the chunk sequence number, since in corrupt circumstances 
>> they may not be the same.
> 
> Hmm, I see your point, and that's a good example to illustrate it.
> But, with that example in front of me, I am rather doubtful that
> either of these is what users actually want. Consider the case where I
> should have chunks 0..17 and chunk 1 is just plain gone. This, by the
> way, seems like a pretty likely case to arise in practice, since all
> we need is for a block to get truncated away or zeroed erroneously, or
> for a tuple to get pruned that shouldn't. With either of the above
> schemes, I guess we're going to get a message about every chunk from 2
> to 17, complaining that they're all misnumbered. We might also get a
> complaint that the last chunk is the wrong size, and that the total
> number of chunks isn't right. What we really want is a single
> complaint saying chunk 1 is missing.
> 
> Likewise, in your example, I sort of feel like what I really want,
> rather than either of the above outputs, is to get some messages like
> this:
> 
> toast value NNN contains unexpected extra chunk [12-17]
> 
> Both your phrasing for those messages and what I suggested make it
> sound like the problem is that the chunk number is wrong. But that
> doesn't seem like it's taking the right view of the situation. Chunks
> 12-17 shouldn't exist at all, and if they do, we should say that, e.g.
> by complaining about something like "toast value 16444 chunk 12
> follows last expected chunk 5"
> 
> In other words, I don't buy the idea that the user will accept the
> idea that there's a chunk number and a chunk sequence number, and that
> they should know the difference between those things and what each of
> them are. They're entitled to imagine that there's just one thing, and
> that we're going to tell them about values that are extra or missing.
> The fact that we're not doing that seems like it's just a matter of
> missing code.

Somehow, we have to get enough information about chunk_seq discontinuity into
the output so that, if the user forwards it to -hackers or the output comes
from a buildfarm critter, we have all the information needed to help diagnose
what went wrong.

As a specific example, if the va_rawsize suggests 2 chunks, and we find 150 
chunks all with contiguous chunk_seq values, that is different from a debugging 
point of view than if we find 150 chunks with chunk_seq values spread all over 
the [0..MAXINT] range.  We can't just tell the user that there were 148 extra 
chunks.  We also shouldn't phrase the error in terms of "extra chunks", since 
it might be the va_rawsize that is corrupt.

I agree that the current message output might be overly verbose in how it 
reports this information.  Conceptually, we want to store up information about 
the chunk issues and report them all at once, but that's hard to do in general, 
as the number of chunk_seq discontinuities might be quite large, much too large 
to fit reasonably into any one message.  Maybe we could report just the first N 
discontinuities rather than all of them, but if somebody wants to open a hex 
editor and walk through the toast table, they won't appreciate having the 
corruption information truncated like that.

All this leads me to believe that we should report the following:

1) If the total number of chunks retrieved differs from the expected number, 
report how many we expected vs. how many we got
2) If the chunk_seq numbers are discontiguous, report each discontiguity.
3) If the index scan returned chunks out of chunk_seq order, report that
4) If any chunk is not the expected size, report that

So, for your example of chunk 1 missing from chunks [0..17], we'd report that 
we got one fewer chunks than we expected, that the second chunk seen was 
discontiguous from the first chunk seen, that the final chunk seen was smaller 
than expected by M bytes, and that the total size was smaller than we expected 
by N bytes.  The third of those is somewhat misleading, since the final chunk 
was presumably the right size; we just weren't expecting to hit a partial chunk 
quite yet.  But I don't see how to make that better in the general case.

> If we start the index scan and get chunk 4, we can
> easily emit messages for chunks 0..3 

Re: Dubious coding in nbtinsert.c

2021-04-08 Thread Tom Lane
Peter Geoghegan  writes:
> You had a near-identical complaint about a compiler warning that led
> to my commit d64f1cdf2f4 -- that one involved _bt_check_unique()'s
> curitup, while this one is about curitemid. While I have no problem
> silencing this compiler warning now, I don't see any reason to not
> just do the same thing again. Which is to initialize the pointer to
> NULL.

Works for me; if there is any bug in the logic, we'll get a core dump
and can investigate.

regards, tom lane




Re: pgsql: autovacuum: handle analyze for partitioned tables

2021-04-08 Thread Zhihong Yu
On Thu, Apr 8, 2021 at 1:12 PM Alvaro Herrera 
wrote:

> On 2021-Apr-08, Tom Lane wrote:
>
> > > So I tend to think that my initial instinct was the better direction:
> we
> > > should not be doing any find_all_inheritors() here at all, but instead
> > > rely on pg_class.reltuples to be set for the partitioned table.
> >
> > +1
>
> This patch does that.
>
> --
> Álvaro Herrera39°49'30"S 73°17'W
> "I dream about dreams about dreams", sang the nightingale
> under the pale moon (Sandman)
>

Hi,
Within truncate_update_partedrel_stats(), dirty is declared within the loop.
+   if (rd_rel->reltuples != 0)
+   {
...
+   if (dirty)

The two if blocks can be merged. The variable dirty can be dropped.

Cheers


Re: pgsql: autovacuum: handle analyze for partitioned tables

2021-04-08 Thread Alvaro Herrera
On 2021-Apr-08, Tom Lane wrote:

> > So I tend to think that my initial instinct was the better direction: we
> > should not be doing any find_all_inheritors() here at all, but instead
> > rely on pg_class.reltuples to be set for the partitioned table.
> 
> +1

This patch does that.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"I dream about dreams about dreams", sang the nightingale
under the pale moon (Sandman)
>From a54701552f2ba9295aae4fe0fc22c7bac912bf45 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 8 Apr 2021 11:10:44 -0400
Subject: [PATCH] Set pg_class.reltuples for partitioned tables

---
 src/backend/commands/analyze.c  | 12 +++
 src/backend/commands/tablecmds.c| 51 -
 src/backend/postmaster/autovacuum.c | 39 +-
 3 files changed, 63 insertions(+), 39 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 5bdaceefd5..2de699d838 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -656,6 +656,18 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 in_outer_xact);
 		}
 	}
+	else if (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+	{
+		/*
+		 * Partitioned tables don't have storage, so we don't set any fields in
+		 * their pg_class entries except for relpages, which is necessary for
+		 * auto-analyze to work properly.
+		 */
+		vac_update_relstats(onerel, -1, totalrows,
+			0, false, InvalidTransactionId,
+			InvalidMultiXactId,
+			in_outer_xact);
+	}
 
 	/*
 	 * Now report ANALYZE to the stats collector.  For regular tables, we do
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1f19629a94..deca860c80 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -337,6 +337,7 @@ typedef struct ForeignTruncateInfo
 static void truncate_check_rel(Oid relid, Form_pg_class reltuple);
 static void truncate_check_perms(Oid relid, Form_pg_class reltuple);
 static void truncate_check_activity(Relation rel);
+static void truncate_update_partedrel_stats(List *parted_rels);
 static void RangeVarCallbackForTruncate(const RangeVar *relation,
 		Oid relId, Oid oldRelId, void *arg);
 static List *MergeAttributes(List *schema, List *supers, char relpersistence,
@@ -1755,6 +1756,7 @@ ExecuteTruncateGuts(List *explicit_rels,
 {
 	List	   *rels;
 	List	   *seq_relids = NIL;
+	List	   *parted_rels = NIL;
 	HTAB	   *ft_htab = NULL;
 	EState	   *estate;
 	ResultRelInfo *resultRelInfos;
@@ -1908,9 +1910,15 @@ ExecuteTruncateGuts(List *explicit_rels,
 		Relation	rel = (Relation) lfirst(lc1);
 		int			extra = lfirst_int(lc2);
 
-		/* Skip partitioned tables as there is nothing to do */
+		/*
+		 * Save OID of partitioned tables for later; nothing else to do for
+		 * them here.
+		 */
 		if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		{
+			parted_rels = lappend_oid(parted_rels, RelationGetRelid(rel));
 			continue;
+		}
 
 		/*
 		 * Build the lists of foreign tables belonging to each foreign server
@@ -2061,6 +2069,9 @@ ExecuteTruncateGuts(List *explicit_rels,
 		ResetSequence(seq_relid);
 	}
 
+	/* Reset partitioned tables' pg_class.reltuples */
+	truncate_update_partedrel_stats(parted_rels);
+
 	/*
 	 * Write a WAL record to allow this set of actions to be logically
 	 * decoded.
@@ -2207,6 +2218,44 @@ truncate_check_activity(Relation rel)
 	CheckTableNotInUse(rel, "TRUNCATE");
 }
 
+/*
+ * Update pg_class.reltuples for all the given partitioned tables to 0.
+ */
+static void
+truncate_update_partedrel_stats(List *parted_rels)
+{
+	Relation	pg_class;
+	HeapTuple	tuple;
+	Form_pg_class rd_rel;
+	ListCell   *lc;
+
+	pg_class = table_open(RelationRelationId, RowExclusiveLock);
+
+	foreach(lc, parted_rels)
+	{
+		Oid		relid = lfirst_oid(lc);
+		bool	dirty = false;
+
+		tuple = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(relid));
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "could not find tuple for relation %u", relid);
+		rd_rel = (Form_pg_class) GETSTRUCT(tuple);
+
+		if (rd_rel->reltuples != 0)
+		{
+			rd_rel->reltuples = (float4) 0;
+			dirty = true;
+		}
+
+		if (dirty)
+			heap_inplace_update(pg_class, tuple);
+
+		heap_freetuple(tuple);
+	}
+
+	table_close(pg_class, RowExclusiveLock);
+}
+
 /*
  * storage_name
  *	  returns the name corresponding to a typstorage/attstorage enum value
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index aef9ac4dd2..a799544738 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3209,44 +3209,7 @@ relation_needs_vacanalyze(Oid relid,
 	 */
 	if (PointerIsValid(tabentry) && AutoVacuumingActive())
 	{
-		if (classForm->relkind != RELKIND_PARTITIONED_TABLE)
-		{
-			reltuples = classForm->reltuples;
-		}
-		else
-		{
-			/*
-			 * If the relation is a partitioned table, we must add up
-			 * children's reltuples.
-			 */
-	

Re: [HACKERS] Custom compression methods

2021-04-08 Thread Robert Haas
On Thu, Apr 8, 2021 at 3:38 PM Justin Pryzby  wrote:
> It looks like this should not remove the word "data" ?

Oh, yes, right.

>  The compression technique used for either in-line or out-of-line compressed
> -data is a fairly simple and very fast member
> -of the LZ family of compression techniques.  See
> -src/common/pg_lzcompress.c for the details.
> +can be selected using the COMPRESSION option on a 
> per-column
> +basis when creating a table. The default for columns with no explicit setting
> +is taken from the value of .
>
> I thought this patch would need to update parts about borrowing 2 spare bits,
> but maybe that's the wrong header..

We're not borrowing any more bits from the places where we were
borrowing 2 bits before. We are borrowing 2 bits from places that
don't seem to be discussed in detail here, where no bits were borrowed
before.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg_amcheck contrib application

2021-04-08 Thread Robert Haas
On Thu, Apr 8, 2021 at 3:02 PM Mark Dilger  wrote:
> Imagine a toasted attribute with 18 chunks numbered [0..17].  Then we update 
> the toast to have only 6 chunks numbered [0..5] except we corruptly keep 
> chunks numbered [12..17] in the toast table.  We'd rather see a report like 
> this:
[ toast value NNN chunk NNN has sequence number NNN, but expected
sequence number NNN ]
> than one like this:
[ toast value NNN contains chunk NNN where chunk NNN was expected ]
>
> because saying the toast value ended at "chunk 12" after saying that it 
> contains "chunk 17" is contradictory.  You need the distinction between the 
> chunk number and the chunk sequence number, since in corrupt circumstances 
> they may not be the same.

Hmm, I see your point, and that's a good example to illustrate it.
But, with that example in front of me, I am rather doubtful that
either of these is what users actually want. Consider the case where I
should have chunks 0..17 and chunk 1 is just plain gone. This, by the
way, seems like a pretty likely case to arise in practice, since all
we need is for a block to get truncated away or zeroed erroneously, or
for a tuple to get pruned that shouldn't. With either of the above
schemes, I guess we're going to get a message about every chunk from 2
to 17, complaining that they're all misnumbered. We might also get a
complaint that the last chunk is the wrong size, and that the total
number of chunks isn't right. What we really want is a single
complaint saying chunk 1 is missing.

Likewise, in your example, I sort of feel like what I really want,
rather than either of the above outputs, is to get some messages like
this:

toast value NNN contains unexpected extra chunk [12-17]

Both your phrasing for those messages and what I suggested make it
sound like the problem is that the chunk number is wrong. But that
doesn't seem like it's taking the right view of the situation. Chunks
12-17 shouldn't exist at all, and if they do, we should say that, e.g.
by complaining about something like "toast value 16444 chunk 12
follows last expected chunk 5"

In other words, I don't buy the idea that the user will accept the
idea that there's a chunk number and a chunk sequence number, and that
they should know the difference between those things and what each of
them are. They're entitled to imagine that there's just one thing, and
that we're going to tell them about values that are extra or missing.
The fact that we're not doing that seems like it's just a matter of
missing code. If we start the index scan and get chunk 4, we can
easily emit messages for chunks 0..3 right on the spot, declaring them
missing. Things do get a bit hairy if the index scan returns values
out of order: what if it gives us chunk_seq = 2 and then chunk_seq =
1? But I think we could handle that by just issuing a complaint in any
such case that "toast index returns chunks out of order for toast
value NNN" and stopping further checking of that toast value.

> Purely as manual testing, and not part of the patch, I hacked the backend a 
> bit to allow direct modification of the toast table.  After corrupting the 
> toast with the following bit of SQL:
>
> WITH chunk_limit AS (
> SELECT chunk_id, MAX(chunk_seq) AS maxseq
> FROM $toastname
> GROUP BY chunk_id)
> INSERT INTO $toastname (chunk_id, chunk_seq, chunk_data)
> (SELECT t.chunk_id,
> t.chunk_seq + cl.maxseq + CASE WHEN 
> t.chunk_seq < 3 THEN 1 ELSE 7 END,
> t.chunk_data
> FROM $toastname t
> INNER JOIN chunk_limit cl
> ON t.chunk_id = cl.chunk_id)
>
> pg_amcheck reports the following corruption messages:
>
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> # toast value 16444 chunk 6 follows last expected chunk 5
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> # toast value 16444 chunk 7 follows last expected chunk 5
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> # toast value 16444 chunk 8 follows last expected chunk 5
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> # toast value 16444 chunk 9 has sequence number 15, but expected sequence 
> number 9
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> # toast value 16444 chunk 10 has sequence number 16, but expected 
> sequence number 10
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> # toast value 16444 chunk 11 has sequence number 17, but expected 
> sequence number 11
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> # toast value 16444 was expected to end at chunk 6, but ended at chunk 12
>
> I think if we'd left out the 

Re: Dubious coding in nbtinsert.c

2021-04-08 Thread Peter Geoghegan
On Thu, Apr 8, 2021 at 12:19 PM Tom Lane  wrote:
> Buildfarm member curculio, which doesn't usually produce
> uninitialized-variable warnings, is showing one here:
>
> nbtinsert.c: In function '_bt_doinsert':
> nbtinsert.c:411: warning: 'curitemid' may be used uninitialized in this 
> function
> nbtinsert.c:411: note: 'curitemid' was declared here
>
> I can see its point: curitemid is set only if !inposting.
> While the first two uses of the value are clearly reached
> only if !inposting, it's FAR from clear that it's impossible
> to reach "ItemIdMarkDead(curitemid);" without a valid value.
> Could you clean that up?

I'll take care of it shortly.

You had a near-identical complaint about a compiler warning that led
to my commit d64f1cdf2f4 -- that one involved _bt_check_unique()'s
curitup, while this one is about curitemid. While I have no problem
silencing this compiler warning now, I don't see any reason to not
just do the same thing again. Which is to initialize the pointer to
NULL.

-- 
Peter Geoghegan




Re: [HACKERS] Custom compression methods

2021-04-08 Thread Justin Pryzby
On Thu, Apr 08, 2021 at 02:58:04PM -0400, Robert Haas wrote:
> On Thu, Apr 8, 2021 at 11:32 AM David Steele  wrote:
> > It looks like this CF entry should have been marked as committed so I
> > did that.
> 
> Thanks.
> 
> Here's a patch for the doc update which was mentioned as an open item 
> upthread.

Thanks too.

It looks like this should not remove the word "data" ?

 The compression technique used for either in-line or out-of-line compressed
-data is a fairly simple and very fast member
-of the LZ family of compression techniques.  See
-src/common/pg_lzcompress.c for the details.
+can be selected using the COMPRESSION option on a per-column
+basis when creating a table. The default for columns with no explicit setting
+is taken from the value of .

I thought this patch would need to update parts about borrowing 2 spare bits,
but maybe that's the wrong header..

-- 
Justin




Dubious coding in nbtinsert.c

2021-04-08 Thread Tom Lane
Buildfarm member curculio, which doesn't usually produce
uninitialized-variable warnings, is showing one here:

nbtinsert.c: In function '_bt_doinsert':
nbtinsert.c:411: warning: 'curitemid' may be used uninitialized in this function
nbtinsert.c:411: note: 'curitemid' was declared here

I can see its point: curitemid is set only if !inposting.
While the first two uses of the value are clearly reached
only if !inposting, it's FAR from clear that it's impossible
to reach "ItemIdMarkDead(curitemid);" without a valid value.
Could you clean that up?

regards, tom lane




Re: multi-install PostgresNode fails with older postgres versions

2021-04-08 Thread Mark Dilger


> On Apr 7, 2021, at 8:43 AM, Andrew Dunstan  wrote:
> 
> 
> On 4/7/21 1:03 AM, Mark Dilger wrote:
>> The v1 patch supported postgres versions back to 8.4, but v2 pushes that 
>> back to 8.1.
>> 
>> The version of PostgresNode currently committed relies on IPC::Run in a way 
>> that is subtly wrong.  The first time IPC::Run::run(X, ...) is called, it 
>> uses the PATH as it exists at that time, resolves the path for X, and caches 
>> it.  Subsequent calls to IPC::Run::run(X, ...) use the cached path, without 
>> respecting changes to $ENV{PATH}.  In practice, this means that:
>> 
>>  use PostgresNode;
>> 
>>  my $a = PostgresNode->get_new_node('a', install_path => '/my/install/8.4');
>>  my $b = PostgresNode->get_new_node('b', install_path => '/my/install/9.0');
>> 
>>  $a->safe_psql(...)   # <=== Resolves and caches 'psql' as 
>> /my/install/8.4/bin/psql
>> 
>>  $b->safe_psql(...)   # <=== Executes /my/install/8.4/bin/psql, not 
>> /my/install/9.0/bin/psql as one might expect
>> 
>> PostgresNode::safe_psql() and PostgresNode::psql() both suffer from this, 
>> and similarly PostgresNode::pg_recvlogical_upto() because the path to 
>> pg_recvlogical gets cached.  Calls to initdb and pg_ctl do not appear to 
>> suffer this problem, as they are ultimately handled by perl's system() call, 
>> not by IPC::Run::run.
>> 
>> Since postgres commands work fairly similarly from one release to another, 
>> this can cause subtle and hard to diagnose bugs in regression tests.  The 
>> fix in v2-0001 works for me, as demonstrated by v2-0002, but whether the fix 
>> in the attached v2 patch set gets used or not, I think something needs to be 
>> done to fix this.
>> 
>> 
> 
> Awesome work. The IPC::Run behaviour is darned unfriendly, and AFAICS
> completely undocumented. It can't even be easily modified by a client
> because the cache is stashed in a lexical variable.

Yes, I noticed that, too.  Even if we could get a patch accepted into IPC::Run, 
we'd need to be compatible with older versions.  So there doesn't seem to be 
any option but to work around the issue.

> You fix looks good.

Thanks for reviewing!


> other notes:
> 
> 
> . needs a perltidy run, some lines are too long (see
> src/tools/pgindent/perltidyrc)
> 
> 
> . Please use an explicit return here:
> 
> 
> +# Return an array reference
> +[ @result ];

Done.


> . I'm not sure the computation in _pg_version_cmp is right. What if the
> number of elements differ? As I read it you would return 0 for a
> comparison of '1.2' and '1.2.3'. Is that what's intended?

Yes, that is intended.  '1.2' and '1.2.0' are not the same.  '1.2' means "some 
unspecified micro release or development version of 1.2", whereas '1.2.0' does 
not.

This is useful for things like $node->at_least_version("13") returning true for 
development versions of version 13, which are otherwise less than (not equal 
to) 13.0

> . The second patch has a bunch of stuff it doesn't need. The control
> file should be unnecessary as should all the lines above 'ifdef
> USE_PGXS' in the Makefile except 'TAP_TESTS = 1'

Done.

Yeah, I started the second patch as simply a means of testing the first and 
didn't clean it up after copying boilerplate from elsewhere.  The second patch 
has turned into something possibly worth keeping in its own right, and having 
the build farm execute on a regular basis.

> . the test script should have a way of passing a non-default version
> file to CrossVersion::nodes(). Possible get it from an environment variable?

Good idea.  I changed it to use $ENV{PG_TEST_VERSIONS_FILE}.  I'm open to other 
names for this variable.

> . I'm somewhat inclined to say that CrossVersion should just return a
> {name => path} map, and let the client script do the node creation. Then
> crossVersion doesn't need to know anything much about the
> infrastructure. But I could possibly be persuaded otherwise. Also, maybe
> it belongs in src/test/perl.

Hmm.  That's a good thought. I've moved it to src/test/perl with the change you 
suggest.

> . This line appears redundant, the variable is not referenced:
> 
> 
> +my $path = $ENV{PATH};

Removed.

> Also these lines at the bottom of CrossVersion.pm are redundant:
> 
> 
> +use strict;
> +use warnings;

Yeah, those are silly.  Removed.

This patch has none of the Object Oriented changes Alvaro and Jehan-Guillaume 
requested, but that should not be construed as an argument against their 
request.  It's just not handled in this particular patch.



v3-0001-Extending-PostgresNode-cross-version-functionalit.patch
Description: Binary data


v3-0002-Adding-modules-test_cross_version.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: pg_amcheck contrib application

2021-04-08 Thread Mark Dilger


> On Apr 7, 2021, at 1:16 PM, Robert Haas  wrote:
> 
> On Sun, Apr 4, 2021 at 8:02 PM Mark Dilger  
> wrote:
>> v18-0001 - Finishes work started in commit 3b6c1259f9 that was overlooked 
>> owing to how I had separated the changes in v17-0002 vs. v17-0003
> 
> Committed.

Thank you.

>> v18-0002 - Postpones the toast checks for a page until after the main table 
>> page lock is released
> 
> Committed, but I changed list_free() to list_free_deep() to avoid a
> memory leak, and I revised the commit message to mention the important
> point that we need to avoid following TOAST pointers from
> potentially-prunable tuples.

Thank you, and yes, I agree with that change.

>> v18-0003 - Improves the corruption messages in ways already discussed 
>> earlier in this thread.  Changes the tests to expect the new messages, but 
>> adds no new checks
> 
> Kibitzing your message wording:
> 
> "toast value %u chunk data is null" -> "toast value %u chunk %d has
> null data". We can mention the chunk number this way.

Changed.

> "toast value %u corrupt extended chunk has invalid varlena header: %0x
> (sequence number %d)" -> "toast value %u chunk %d has invalid varlena
> header %0x". We can be more consistent about how we incorporate the
> chunk number into the text, and we don't really need to include the
> word corrupt, because all of these are corruption complaints, and I
> think it looks better without the colon.

Changed.

> "toast value %u chunk sequence number %u does not match the expected
> sequence number %u" -> "toast value %u contains chunk %d where chunk
> %d was expected". Shorter. Uses %d for a sequence number instead of
> %u, which I think is correct -- anyway we should have them all one way
> or all the other. I think I'd rather ditch the "sequence number"
> technology and just talk about "chunk %d" or whatever.

I don't agree with this one.  I do agree with changing the message, but not to 
the message you suggest.

Imagine a toasted attribute with 18 chunks numbered [0..17].  Then we update
the toast value to have only 6 chunks numbered [0..5], except that chunks
numbered [12..17] are corruptly left behind in the toast table.  We'd rather
see a report like this:

# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 6 has sequence number 12, but expected sequence 
number 6
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 7 has sequence number 13, but expected sequence 
number 7
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 8 has sequence number 14, but expected sequence 
number 8
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 9 has sequence number 15, but expected sequence 
number 9
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 10 has sequence number 16, but expected sequence 
number 10
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 11 has sequence number 17, but expected sequence 
number 11
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 was expected to end at chunk 6, but ended at chunk 12

than one like this:

# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 contains chunk 12 where chunk 6 was expected
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 contains chunk 13 where chunk 7 was expected
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 contains chunk 14 where chunk 8 was expected
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 contains chunk 15 where chunk 9 was expected
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 contains chunk 16 where chunk 10 was expected
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 contains chunk 17 where chunk 11 was expected
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 was expected to end at chunk 6, but ended at chunk 12

because saying the toast value ended at "chunk 12" after saying that it 
contains "chunk 17" is contradictory.  You need the distinction between the 
chunk number and the chunk sequence number, since in corrupt circumstances they 
may not be the same.
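
To make the distinction concrete: the stored chunk_seq is what these messages
call the sequence number, while the chunk number is simply the ordinal
position in which the checker encounters the chunks.  A quick, minimal way to
eyeball the raw layout (the toast relation name below is hypothetical; find
the real one via pg_class.reltoastrelid for the table being checked):

select chunk_id, chunk_seq, octet_length(chunk_data) as chunk_len
from pg_toast.pg_toast_16384   -- hypothetical toast relation name
where chunk_id = 16444
order by chunk_seq;

In the corrupt scenario above this would list chunk_seq values 0..5 and
12..17 for value 16444, while the checker numbers the chunks it actually
sees 0..11.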

> "toast value %u chunk sequence number %u exceeds the end chunk
> sequence number %u" -> "toast value %u chunk %d follows last expected
> chunk %d"

Changed.

> "toast value %u chunk size %u differs from the expected size %u" ->
> "toast value %u chunk %d has size %u, but expected size %u"

Changed.

> Other complaints:
> 
> Your commit message fails to mention the addition of
> 

Re: [HACKERS] Custom compression methods

2021-04-08 Thread Robert Haas
On Thu, Apr 8, 2021 at 11:32 AM David Steele  wrote:
> It looks like this CF entry should have been marked as committed so I
> did that.

Thanks.

Here's a patch for the doc update which was mentioned as an open item upthread.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


update-storage-docs.patch
Description: Binary data


Re: pgsql: autovacuum: handle analyze for partitioned tables

2021-04-08 Thread Tom Lane
Alvaro Herrera  writes:
> On 2021-Apr-08, Tom Lane wrote:
>> Yeah.  I hit this on another machine that isn't using EXEC_BACKEND,
>> and I concur it looks more like a race condition.  I think the problem
>> is that autovacuum is calling find_all_inheritors() on a relation it
>> has no lock on, contrary to that function's API spec.

> Hmm.  Autovacuum tries hard to avoid grabbing locks on relations until
> really needed (at vacuum/analyze time), which is why all these tests
> only use data that can be found in the pg_class rows and pgstat entries.

Yeah, I was worried about that.

> So I tend to think that my initial instinct was the better direction: we
> should not be doing any find_all_inheritors() here at all, but instead
> rely on pg_class.reltuples to be set for the partitioned table.

+1

regards, tom lane




Re: pgsql: autovacuum: handle analyze for partitioned tables

2021-04-08 Thread Alvaro Herrera
On 2021-Apr-08, Tom Lane wrote:

> Yeah.  I hit this on another machine that isn't using EXEC_BACKEND,
> and I concur it looks more like a race condition.  I think the problem
> is that autovacuum is calling find_all_inheritors() on a relation it
> has no lock on, contrary to that function's API spec.  find_all_inheritors
> assumes the OID it's given is valid and locked, and adds it to the
> result list automatically.  Then it looks for children, and won't find
> any in the race case where somebody else just dropped the table.

Hmm.  Autovacuum tries hard to avoid grabbing locks on relations until
really needed (at vacuum/analyze time), which is why all these tests
only use data that can be found in the pg_class rows and pgstat entries.
So I tend to think that my initial instinct was the better direction: we
should not be doing any find_all_inheritors() here at all, but instead
rely on pg_class.reltuples to be set for the partitioned table.

I'll give that another look.  Most places already assume that reltuples
isn't set for a partitioned table, so they shouldn't care.  I wonder,
though, whether we should set relpages to some value other than 0 or -1.
(I'm inclined not to, since autovacuum does not use it.)
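
For illustration, the fields in question can be eyeballed directly; the table
names here are made up:

select relname, relkind, reltuples, relpages
from pg_class
where relname in ('measurement', 'measurement_y2020', 'measurement_y2021')
order by relname;

The idea would be that the relkind = 'p' row carries a usable reltuples after
ANALYZE, instead of autovacuum summing the partitions' values on every
iteration.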

-- 
Álvaro Herrera   Valdivia, Chile




Re: Have I found an interval arithmetic bug?

2021-04-08 Thread Bryn Llewellyn
> On 08-Apr-2021, at 10:24, Bruce Momjian  wrote:
> 
> On Mon, Apr  5, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
>> On Mon, Apr  5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
>> Well, bug or not, we are not going to change back branches for this, and
>> if you want a larger discussion, it will have to wait for PG 15.
>> 
 https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
 « …field values can have fractional parts; for example '1.5 week' or 
 '01:02:03.45'. Such input is converted to the appropriate number of 
 months, days, and seconds for storage. When this would result in a 
 fractional number of months or days, the fraction is added to the 
 lower-order fields using the conversion factors 1 month = 30 days and 1 
 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only 
 seconds will ever be shown as fractional on output. »
>> 
>> I see that.  What is not clear here is how far we flow down.  I was
>> looking at adding documentation or regression tests for that, but was
>> unsure.  I adjusted the docs slightly in the attached patch.
> 
> Here is an updated patch, which will be for PG 15.  It updates the
> documentation to state:
> 
>   The fractional parts are used to compute appropriate values for the next
>   lower-order internal fields (months, days, seconds).
> 
> It removes the flow from fractional months/weeks to
> hours-minutes-seconds, and adds missing rounding for fractional
> computations.

Thank you Bruce. I look forward to documenting this new algorithm for 
YugabyteDB. The algorithm implements the transformation from this:

[
  yy_in numeric,
  mo_in numeric,
  dd_in numeric,
  hh_in numeric,
  mi_in numeric,
  ss_in numeric
]

to this:

[
  mo_internal_representation int,
  dd_internal_representation int,
  ss_internal_representation numeric(1000,6)
]

I am convinced that a prose account of the algorithm, by itself, is not the 
best way to tell the reader the rules that the algorithm implements. Rather, 
pseudocode is needed. I mentioned before that actual executable PL/pgSQL code 
is better still. (I can expect readers to be fluent in PL/pgSQL.) Given this 
executable simulation, an informal prose sketch of what it does will 
definitely add value.

May I ask you to fill in the body of this stub by translating the C that you 
have in hand?

create type internal_representation_t as(
  mo_internal_representation int,
  dd_internal_representation int,
  ss_internal_representation numeric(1000,6));

create function internal_representation(
  yy_in numeric default 0,
  mo_in numeric default 0,
  dd_in numeric default 0,
  hh_in numeric default 0,
  mi_in numeric default 0,
  ss_in numeric default 0)
  returns internal_representation_t
  language plpgsql
as $body$
declare
  mo_internal_representation  int not null := 0;
  dd_internal_representation  int not null := 0;
  ss_internal_representation  numeric not null := 0.0;

  ok constant boolean :=
(yy_in is not null) and
(mo_in is not null) and
(dd_in is not null) and
(hh_in is not null) and
(mi_in is not null) and
(ss_in is not null);
begin
  assert ok, 'No actual argument, when provided, may be null';

  -- The algorithm.

  return (mo_internal_representation, dd_internal_representation, 
ss_internal_representation)::internal_representation_t;
end;
$body$;

By the way, I believe that a user might well decide always to supply all the 
fields in a "from text to interval" typecast, except for the seconds, as 
integral values. This, after all, is what the "make_interval()" function 
enforces. But, because the typecast approach allows non-integral values, the 
reference documentation must explain the rules unambiguously so that the reader 
can predict the outcome of any ad hoc test that they might try.
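
For example, here are a couple of such ad hoc probes, with the outcomes that 
I believe the current (pre-PG-15) rules quoted above (1 month = 30 days, 
1 day = 24 hours) produce:

select '1.5 months'::interval;                  -- 1 mon 15 days
select '1.5 weeks'::interval;                   -- 10 days 12:00:00 (the 0.5 day spills to hours)
select make_interval(months => 1, days => 15);  -- integral fields only, nothing to spill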

It's a huge pity that the three values of the internal representation cannot be 
observed directly using SQL because each behaves with different semantics when 
an interval value is added to a timestamptz value. However, as a second best 
(and knowing the algorithm above), a user can create interval values where only 
one of the three fields is populated and test their understanding of the 
semantic rules that way.



Re: VACUUM (DISABLE_PAGE_SKIPPING on)

2021-04-08 Thread Simon Riggs
On Thu, 8 Apr 2021 at 18:15, Alvaro Herrera  wrote:
>
> On 2021-Apr-08, Simon Riggs wrote:
>
> > On Thu, 8 Apr 2021 at 16:58, David Steele  wrote:
>
> > > It's not clear to me which patch is which, so perhaps move one CF entry
> > > to next CF and clarify which patch is current?
> >
> > Entry: Maximize page freezing
> > has this patch, perfectly fine, awaiting review from Alvaro since 29 Nov.
> > one_freeze_then_max_freeze.v7.patch
>
> Oh dear, was this waiting on me?  It was not in my to-do, so I didn't
> pay close attention.

No problem, if I felt the idea deserved priority attention I would
have pinged you.

I think I'll open a new thread later to allow it to be discussed
without confusion.

-- 
Simon Riggs    http://www.EnterpriseDB.com/




weird interaction between asynchronous queries and pg_sleep

2021-04-08 Thread Merlin Moncure
Consider the following snippet

create table data as select generate_series(1,100) s;

do $d$
begin
  PERFORM * FROM dblink_connect('test','');

  PERFORM * from dblink_send_query('test', 'SELECT * FROM data');

  LOOP
if dblink_is_busy('test') = 0
THEN
  PERFORM * FROM dblink_get_result('test') AS R(V int);
  PERFORM * FROM dblink_get_result('test') AS R(V int);
  RETURN;
END IF;

PERFORM pg_sleep(.001);
  END LOOP;

  PERFORM * FROM dblink_disconnect('test');
END;
$d$;

What's interesting here is that, when I vary the sleep parameter, I get:
0: .4 seconds (per top, this is busywait), same as running synchronous.
0.01: 1.4 seconds
0.001: 2.4 seconds
0.01: 10.6 seconds
0.1: does not terminate

This effect is only noticeable when the remote query is returning
volumes of data.  My question is, is there any way to sleep in a loop
client side without incurring a 3x performance penalty?  Why is it
that performance improves when more local sleep queries are executed?

merlin




Re: test runner (was Re: SQL-standard function body)

2021-04-08 Thread Andres Freund
On 2021-04-08 10:50:39 -0700, Andres Freund wrote:
> It's hard to convey just how much nicer it is to see a progress report
> during the test, see the failing tests at the end, without needing to
> wade through reams of log output.  The output of the individual tests is
> in testlog.txt referenced above.

https://anarazel.de/public/t/pg-meson-test-screencap-2021-04-08_10.58.26.mkv

Greetings,

Andres Freund




Re: pgsql: autovacuum: handle analyze for partitioned tables

2021-04-08 Thread Tom Lane
Alvaro Herrera  writes:
> On 2021-Apr-08, Tom Lane wrote:
>> Looks like this has issues under EXEC_BACKEND:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2021-04-08%2005%3A50%3A08

> Hmm, I couldn't reproduce this under EXEC_BACKEND or otherwise, but I
> think this is unrelated to that, but rather a race condition.

Yeah.  I hit this on another machine that isn't using EXEC_BACKEND,
and I concur it looks more like a race condition.  I think the problem
is that autovacuum is calling find_all_inheritors() on a relation it
has no lock on, contrary to that function's API spec.  find_all_inheritors
assumes the OID it's given is valid and locked, and adds it to the
result list automatically.  Then it looks for children, and won't find
any in the race case where somebody else just dropped the table.
So we come back to relation_needs_vacanalyze with a list of just the
original rel OID, and since this loop believes that every syscache fetch
it does will succeed, kaboom.

I do not think it is sane to do find_all_inheritors() with no lock,
so I'd counsel doing something about that rather than band-aiding the
symptom.  On the other hand, it's also not really okay not to have
an if-test-and-elog after the SearchSysCache call.  "A cache lookup
cannot fail" is not an acceptable assumption in my book.

BTW, another thing that looks like a race condition is the
extract_autovac_opts() call that is done a little bit earlier,
also without lock.  I think this is actually safe, but it's ONLY
safe because we resisted the calls by certain people to add a
toast table to pg_class.  Otherwise, fetching reloptions could
have involved a toast pointer dereference, and it would then be
racy whether the toasted data was still there.  As-is, even if
the pg_class row we're looking at has been deleted, we can safely
disassemble its reloptions.  I think this matter is deserving
of a comment at least.

regards, tom lane




test runner (was Re: SQL-standard function body)

2021-04-08 Thread Andres Freund
Hi,

This started out as a reply to 
https://postgr.es/m/20210408170802.GA9392%40alvherre.pgsql
but it's independent enough to just start a new thread...

On 2021-04-08 13:08:02 -0400, Alvaro Herrera wrote:
> Yes, coverage.pg.org runs "make check-world".
>
> Maybe it would make sense to change that script, so that it runs the
> buildfarm's run_build.pl script instead of "make check-world".  That
> would make coverage.pg.org report what the buildfarm actually tests ...
> it would have made this problem a bit more obvious.

We desperately need to unify the different test run environments we
have. I did spend some time trying to do that, and ended up with it
being hard to do in a good way in the make / msvc environment. Not sure
that I took the right path, but I ended up doing an experimental port of
the build system to meson - which has a builtin test runner (working on
all platforms...).

andres@awork3:/tmp$ ccache --clear
andres@awork3:/tmp$ ~/src/meson/meson.py setup ~/src/postgresql 
/tmp/pg-meson --prefix=/tmp/pg-meson-install
The Meson build system
Version: 0.57.999
Source dir: /home/andres/src/postgresql
Build dir: /tmp/pg-meson
Build type: native build
Project name: postgresql
Project version: 14devel
...
Header  has symbol "fdatasync" : YES
Header  has symbol "F_FULLSYNC" : NO
Checking for alignment of "short" : 2
Checking for alignment of "int" : 4
...
Configuring pg_config_ext.h using configuration
Configuring pg_config.h using configuration
Configuring pg_config_paths.h using configuration
Program sed found: YES (/usr/bin/sed)
Build targets in project: 116

Found ninja-1.10.1 at /usr/bin/ninja
...

andres@awork3:/tmp/pg-meson$ time ninja
[10/1235] Generating snowball_create with a custom command
Generating tsearch script...
[41/1235] Generating generated_catalog_headers with a custom command
[1235/1235] Linking target contrib/test_decoding/test_decoding.so

real0m10.752s
user3m47.020s
sys 0m50.281s

...
andres@awork3:/tmp/pg-meson$ time ninja
[1/1] Generating test clean with a custom command

real0m0.085s
user0m0.068s
sys 0m0.016s
...

andres@awork3:/tmp/pg-meson$ time ~/src/meson/meson.py install --quiet
ninja: Entering directory `.'

real0m0.541s
user0m0.412s
sys 0m0.130s

...

andres@awork3:/tmp/pg-meson$ ninja test
[1/2] Running all tests.
 1/74 postgresql:setup / temp_install   
  OK0.52s
 2/74 postgresql:setup / cleanup_old
  OK0.01s
 3/74 postgresql:tap+pg_archivecleanup / 
pg_archivecleanup/t/010_pg_archivecleanup.pl OK0.29s   
42 subtests passed
 4/74 postgresql:tap+pg_checksums / pg_checksums/t/001_basic.pl 
  OK0.27s   8 subtests passed
 5/74 postgresql:tap+pg_config / pg_config/t/001_pg_config.pl   
  OK0.26s   20 subtests passed
...
68/74 postgresql:tap+pg_dump / pg_dump/t/002_pg_dump.pl 
  OK   28.26s   6408 subtests passed
...
74/74 postgresql:isolation / pg_isolation_regress   
  OK  114.91s


Ok: 74
Expected Fail:  0
Fail:   0
Unexpected Pass:0
Skipped:0
Timeout:0

Full log written to /tmp/pg-meson/meson-logs/testlog.txt


And in cases of failures it'll show the failure when it happens
(including the command to rerun just that test, without the harness in
between), and then a summary at the end:

61/74 postgresql:tap+pg_verifybackup / pg_verifybackup/t/003_corruption.pl  
  OK   10.65s   44 subtests passed
49/74 postgresql:tap+recovery / recovery/t/019_replslot_limit.pl
  ERROR 7.53s   exit status 1
>>> MALLOC_PERTURB_=16 
PATH=/tmp/pg-meson/tmp_install///usr/local/bin:/home/andres/bin/pg:/home/andres/bin/bin:/usr/sbin:/sbin:/home/andres/bin/pg:/home/andres/bin/bin:/usr/sbin:/sbin:/home/andres/bin/pg:/home/andres/bin/bin:/usr/sbin:/sbin:/home/andres/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/snap/bin
 PG_REGRESS=/tmp/pg-meson/src/test/regress/pg_regress 
REGRESS_SHLIB=/tmp/pg-meson/src/test/regress/regress.so 
LD_LIBRARY_PATH=/tmp/pg-meson/tmp_install///usr/local/lib/x86_64-linux-gnu 
/home/andres/src/postgresql/src/tools/testwrap /tmp/pg-meson recovery 
t/019_replslot_limit.pl perl -I /home/andres/src/postgresql/src/test/perl -I 
/home/andres/src/postgresql/src/test/recovery 
/home/andres/src/postgresql/src/test/recovery/t/019_replslot_limit.pl


Re: Have I found an interval arithmetic bug?

2021-04-08 Thread Bruce Momjian
On Mon, Apr  5, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
> On Mon, Apr  5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
> Well, bug or not, we are not going to change back branches for this, and
> if you want a larger discussion, it will have to wait for PG 15.
> 
> > > https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
> > > « …field values can have fractional parts; for example '1.5 week' or 
> > > '01:02:03.45'. Such input is converted to the appropriate number of 
> > > months, days, and seconds for storage. When this would result in a 
> > > fractional number of months or days, the fraction is added to the 
> > > lower-order fields using the conversion factors 1 month = 30 days and 1 
> > > day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. 
> > > Only seconds will ever be shown as fractional on output. »
> 
> I see that.  What is not clear here is how far we flow down.  I was
> looking at adding documentation or regression tests for that, but was
> unsure.  I adjusted the docs slightly in the attached patch.

Here is an updated patch, which will be for PG 15.  It updates the
documentation to state:

The fractional parts are used to compute appropriate values for the next
lower-order internal fields (months, days, seconds).

It removes the flow from fractional months/weeks to
hours-minutes-seconds, and adds missing rounding for fractional
computations.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

>From 0b5bf02e5baaa29e74fac9ac3823d593b6c5e3f2 Mon Sep 17 00:00:00 2001
From: Bruce Momjian 
Date: Thu, 8 Apr 2021 13:07:16 -0400
Subject: [PATCH] interval squash commit

---
 doc/src/sgml/datatype.sgml| 16 +++-
 src/backend/utils/adt/datetime.c  | 13 ++---
 src/interfaces/ecpg/pgtypeslib/interval.c | 12 ++--
 src/test/regress/expected/interval.out|  8 
 src/test/regress/sql/interval.sql |  2 +-
 5 files changed, 24 insertions(+), 27 deletions(-)

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 7c341c8e3f..9454107e75 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2770,15 +2770,13 @@ P  years-months-
 
 
 
- In the verbose input format, and in some fields of the more compact
- input formats, field values can have fractional parts; for example
- '1.5 week' or '01:02:03.45'.  Such input is
- converted to the appropriate number of months, days, and seconds
- for storage.  When this would result in a fractional number of
- months or days, the fraction is added to the lower-order fields
- using the conversion factors 1 month = 30 days and 1 day = 24 hours.
- For example, '1.5 month' becomes 1 month and 15 days.
- Only seconds will ever be shown as fractional on output.
+ Field values can have fractional parts;  for example, '1.5
+ weeks' or '01:02:03.45'.  The fractional
+ parts are used to compute appropriate values for the next lower-order
+ internal fields (months, days, seconds).  For example, '1.5
+ months' becomes 1 month 15 days.
+ The fractional conversion factors used are 1 month = 30 days and 1 day
+ = 24 hours.  Only seconds will ever be shown as fractional on output.
 
 
 
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 889077f55c..d5b3705145 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -526,7 +526,6 @@ AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
 	extra_days = (int) frac;
 	tm->tm_mday += extra_days;
 	frac -= extra_days;
-	AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
 }
 
 /* Fetch a fractional-second value with suitable error checking */
@@ -3307,28 +3306,28 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 	case DTK_YEAR:
 		tm->tm_year += val;
 		if (fval != 0)
-			tm->tm_mon += fval * MONTHS_PER_YEAR;
+			tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
 		tmask = DTK_M(YEAR);
 		break;
 
 	case DTK_DECADE:
 		tm->tm_year += val * 10;
 		if (fval != 0)
-			tm->tm_mon += fval * MONTHS_PER_YEAR * 10;
+			tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
 		tmask = DTK_M(DECADE);
 		break;
 
 	case DTK_CENTURY:
 		tm->tm_year += val * 100;
 		if (fval != 0)
-			tm->tm_mon += fval * MONTHS_PER_YEAR * 100;
+			tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
 		tmask = DTK_M(CENTURY);
 		break;
 
 	case DTK_MILLENNIUM:
 		tm->tm_year += val * 1000;
 		if (fval != 0)
-			tm->tm_mon += fval * MONTHS_PER_YEAR * 1000;
+			tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
 		tmask = DTK_M(MILLENNIUM);
 		break;
 
@@ -3565,7 +3564,7 @@ 

Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

2021-04-08 Thread Alvaro Herrera
On 2021-Apr-08, James Coleman wrote:

> On Thu, Apr 8, 2021 at 1:04 PM Alvaro Herrera  wrote:
> >
> > On 2021-Apr-08, James Coleman wrote:
> >
> > > I assume proper procedure for the CF entry is to move it into the
> > > current CF and then mark it as committed, however I don't know how (or
> > > don't have permissions?) to move it into the current CF. How does one
> > > go about doing that?
> > >
> > > Here's the entry: https://commitfest.postgresql.org/29/2542/
> >
> > Done, thanks.

> Thanks. Is that something I should be able to do myself

No, sorry.

-- 
Álvaro Herrera   Valdivia, Chile
"La vida es para el que se aventura"




Re: VACUUM (DISABLE_PAGE_SKIPPING on)

2021-04-08 Thread Alvaro Herrera
On 2021-Apr-08, Simon Riggs wrote:

> On Thu, 8 Apr 2021 at 16:58, David Steele  wrote:

> > It's not clear to me which patch is which, so perhaps move one CF entry
> > to next CF and clarify which patch is current?
> 
> Entry: Maximize page freezing
> has this patch, perfectly fine, awaiting review from Alvaro since 29 Nov.
> one_freeze_then_max_freeze.v7.patch

Oh dear, was this waiting on me?  It was not in my to-do, so I didn't
pay close attention.

-- 
Álvaro Herrera    39°49'30"S 73°17'W




Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

2021-04-08 Thread James Coleman
On Thu, Apr 8, 2021 at 1:04 PM Alvaro Herrera  wrote:
>
> On 2021-Apr-08, James Coleman wrote:
>
> > I assume proper procedure for the CF entry is to move it into the
> > current CF and then mark it as committed, however I don't know how (or
> > don't have permissions?) to move it into the current CF. How does one
> > go about doing that?
> >
> > Here's the entry: https://commitfest.postgresql.org/29/2542/
>
> Done, thanks.
>
> --
> Álvaro Herrera    39°49'30"S 73°17'W

Thanks. Is that something I should be able to do myself (or should I
ask someone for privileges in the app to do so)? I'm not sure what the
project policy is on that.

James




Re: psql - add SHOW_ALL_RESULTS option

2021-04-08 Thread Julien Rouhaud
Bonjour Fabien,

On Thu, Apr 08, 2021 at 07:04:01PM +0200, Fabien COELHO wrote:
> > 
> > Thanks for the patch Fabien.  I've hit this issue multiple times and this is
> > indeed unwelcome.  Should we add it as an open item?
> 
> It is definitely an open item. I'm not sure where you want to add it…
> possibly the "Pg 14 Open Items" wiki page?

Correct.

> I tried but I do not have enough
> privileges, if you can do it please proceed. I added an entry in the next CF
> in the bugfix section.

That's strange, I don't think you need special permission there.  It's
working for me so I added an item with a link to the patch!




Re: SQL-standard function body

2021-04-08 Thread Alvaro Herrera
On 2021-Apr-08, Julien Rouhaud wrote:

> On Thu, Apr 08, 2021 at 02:58:02AM -0400, Tom Lane wrote:

> > No, because if that were the explanation then we'd be getting no
>> buildfarm coverage at all for pg_stat_statements.  Which aside
> > from being awful contradicts the results at coverage.postgresql.org.
> 
> Is there any chance that coverage.postgresql.org isn't backed by the buildfarm
> client but a plain make check-world or something like that?

Yes, coverage.pg.org runs "make check-world".

Maybe it would make sense to change that script, so that it runs the
buildfarm's run_build.pl script instead of "make check-world".  That
would make coverage.pg.org report what the buildfarm actually tests ...
it would have made this problem a bit more obvious.

-- 
Álvaro Herrera    39°49'30"S 73°17'W




Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

2021-04-08 Thread Alvaro Herrera
On 2021-Apr-08, James Coleman wrote:

> I assume proper procedure for the CF entry is to move it into the
> current CF and then mark it as committed, however I don't know how (or
> don't have permissions?) to move it into the current CF. How does one
> go about doing that?
> 
> Here's the entry: https://commitfest.postgresql.org/29/2542/

Done, thanks.

-- 
Álvaro Herrera    39°49'30"S 73°17'W




Re: psql - add SHOW_ALL_RESULTS option

2021-04-08 Thread Fabien COELHO


Bonjour Julien,


>>> Attached a patch which attempts to fix this by moving the cancellation
>>> cancelling request after processing results.
>>
>> Thank you for the fix. I tested and the problem has been solved
>> after applying your patch.
>
> Thanks for the patch Fabien.  I've hit this issue multiple times and this is
> indeed unwelcome.  Should we add it as an open item?


It is definitely an open item. I'm not sure where you want to add it… 
possibly the "Pg 14 Open Items" wiki page? I tried but I do not have 
enough privileges, if you can do it please proceed. I added an entry in 
the next CF in the bugfix section.


--
Fabien.

Re: SQL-standard function body

2021-04-08 Thread Andrew Dunstan


On 4/8/21 2:40 AM, Julien Rouhaud wrote:
> On Wed, Apr 07, 2021 at 11:33:20PM -0700, Andres Freund wrote:
>> Hi,
>>
>> On 2021-04-08 02:05:25 -0400, Tom Lane wrote:
>>> So far the buildfarm seems to be turning green after b3ee4c503 ...
>>> so I wonder what extra condition is needed to cause the failure
>>> Andres is seeing.
>> Nothing special, really. Surprised the BF doesn't see it:
>>
>> andres@awork3:~/build/postgres/dev-assert/vpath$ cat /tmp/test.conf
>> force_parallel_mode=regress
>> andres@awork3:~/build/postgres/dev-assert/vpath$ make -j48 -s && 
>> EXTRA_REGRESS_OPTS='--temp-config /tmp/test.conf' make -s -C 
>> contrib/pg_stat_statements/ check
>> All of PostgreSQL successfully made. Ready to install.
>> ...
>> The differences that caused some tests to fail can be viewed in the
>> file 
>> "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.diffs".
>>   A copy of the test summary that you see
>> above is saved in the file 
>> "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.out".
>> ...
> I think this is because the buildfarm client is running installcheck for the
> contribs rather than check, and pg_stat_statements/Makefile has:
>
> # Disabled because these tests require 
> "shared_preload_libraries=pg_stat_statements",
> # which typical installcheck users do not have (e.g. buildfarm clients).
> NO_INSTALLCHECK = 1
>
>
>


Yeah, Julien is right, we run "make check" for these in src/test/modules
but I missed contrib. I have fixed this on crake so we get some
immediate coverage and a fix will be pushed to github shortly.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-04-08 Thread Bruce Momjian
On Thu, Apr  8, 2021 at 12:51:06PM -0400, Álvaro Herrera wrote:
> On 2021-Apr-08, Bruce Momjian wrote:
> 
> > pg_stat_activity.queryid is new, but I can imagine cases where you would
> > join pg_stat_activity to pg_stat_statements to get an estimate of how
> > long the query will take --- having one using an underscore and another
> > one not seems odd.
> 
> OK.  So far, you have one vote for queryid (your own) and two votes for
> query_id (mine and Julien's).  And even you yourself were hesitating about
> it earlier in the thread.

OK, if people are fine with pg_stat_activity.query_id not matching
pg_stat_statements.queryid, I am fine with that.  I just don't want
someone to say it was a big mistake later.  ;-)

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

2021-04-08 Thread James Coleman
On Thu, Apr 8, 2021 at 8:01 AM David Rowley  wrote:
>
> On Thu, 8 Apr 2021 at 22:54, David Rowley  wrote:
> >
> > I think the changes in the patch are fairly isolated and the test
> > coverage is now pretty good.  I'm planning on looking at the patch
> > again now and will consider pushing it for PG14.
>
> I pushed this with some minor cleanup from the v6 patch I posted earlier.
>
> David

Thank you!

I assume proper procedure for the CF entry is to move it into the
current CF and then mark it as committed, however I don't know how (or
don't have permissions?) to move it into the current CF. How does one
go about doing that?

Here's the entry: https://commitfest.postgresql.org/29/2542/

Thanks,
James




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-04-08 Thread Alvaro Herrera
On 2021-Apr-08, Bruce Momjian wrote:

> pg_stat_activity.queryid is new, but I can imagine cases where you would
> join pg_stat_activity to pg_stat_statements to get an estimate of how
> long the query will take --- having one using an underscore and another
> one not seems odd.

OK.  So far, you have one vote for queryid (your own) and two votes for
query_id (mine and Julien's).  And even you yourself were hesitating about
it earlier in the thread.

-- 
Álvaro Herrera   Valdivia, Chile




Re: VACUUM (DISABLE_PAGE_SKIPPING on)

2021-04-08 Thread Simon Riggs
On Thu, 8 Apr 2021 at 17:44, David Steele  wrote:
>
> On 4/8/21 12:29 PM, Simon Riggs wrote:
> > On Thu, 8 Apr 2021 at 16:58, David Steele  wrote:
> >
>  It has been five months since this patch was updated, so marking
>  Returned with Feedback.
> 
>  Please resubmit to the next CF when you have a new patch.
> >>>
> >>> There are 2 separate patch-sets on this thread, with separate CF entries.
> >>>
> >>> One has requested changes that have not been made. I presume this one
> >>> has been RwF'd.
> >>>
> >>> What is happening about the other patch?
> >>
> >> Hmmm, well https://commitfest.postgresql.org/32/2908 and
> >> https://commitfest.postgresql.org/32/2909 are both linked to the same
> >> thread with the same patch, so I thought it was a duplicate.
> >
> > Nobody had mentioned any confusion. Multiple patches on one thread is 
> > common.
>
> Yes, but these days they generally get updated in the same reply so the
> cfbot can test them all. In your case only the latest patch was being
> tested and it was not applying cleanly.
>
> >> It's not clear to me which patch is which, so perhaps move one CF entry
> >> to next CF and clarify which patch is current?
> >
> > Entry: Maximize page freezing
> > has this patch, perfectly fine, awaiting review from Alvaro since 29 Nov.
> > one_freeze_then_max_freeze.v7.patch
>
> That CF entry was marked Waiting for Author on 3/11, so I thought it was
> for this patch. In fact, both CF entries were Waiting for Author for
> some time.

That was done by someone that hadn't mentioned it to me, or the list.

It is not Waiting on Author.

> Moved to the next CF in Waiting for Author state.

That is clearly an incorrect state.

-- 
Simon Riggs    http://www.EnterpriseDB.com/




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-04-08 Thread Bruce Momjian
On Fri, Apr  9, 2021 at 12:38:29AM +0800, Julien Rouhaud wrote:
> On Thu, Apr 08, 2021 at 11:34:25AM -0400, Bruce Momjian wrote:
> > 
> > OK, let's get some details.  First, pg_stat_statements.queryid already
> > exists (no underscore), and I don't think anyone wants to change that. 
> > 
> > pg_stat_activity.queryid is new, but I can imagine cases where you would
> > join pg_stat_activity to pg_stat_statements to get an estimate of how
> > long the query will take --- having one using an underscore and another
> > one not seems odd.
> 
> Indeed, and also being able to join with a USING clause rather than an ON 
> could
> also save some keystrokes.  But unfortunately, we already have (userid, dbid)
> on pg_stat_statements side vs (usesysid, datid) on pg_stat_activity side, so
> this unfortunately won't fix all the oddities.

Wow, good point.  Shame they don't match.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: VACUUM (DISABLE_PAGE_SKIPPING on)

2021-04-08 Thread David Steele

On 4/8/21 12:29 PM, Simon Riggs wrote:

On Thu, 8 Apr 2021 at 16:58, David Steele  wrote:


It has been five months since this patch was updated, so marking
Returned with Feedback.

Please resubmit to the next CF when you have a new patch.


There are 2 separate patch-sets on this thread, with separate CF entries.

One has requested changes that have not been made. I presume this one
has been RwF'd.

What is happening about the other patch?


Hmmm, well https://commitfest.postgresql.org/32/2908 and
https://commitfest.postgresql.org/32/2909 are both linked to the same
thread with the same patch, so I thought it was a duplicate.


Nobody had mentioned any confusion. Multiple patches on one thread is common.


Yes, but these days they generally get updated in the same reply so the 
cfbot can test them all. In your case only the latest patch was being 
tested and it was not applying cleanly.



It's not clear to me which patch is which, so perhaps move one CF entry
to next CF and clarify which patch is current?


Entry: Maximize page freezing
has this patch, perfectly fine, awaiting review from Alvaro since 29 Nov.
one_freeze_then_max_freeze.v7.patch


That CF entry was marked Waiting for Author on 3/11, so I thought it was 
for this patch. In fact, both CF entries were Waiting for Author for 
some time.


Moved to the next CF in Waiting for Author state.

Regards,
--
-David
da...@pgmasters.net




Re: doc review for v14

2021-04-08 Thread Justin Pryzby
Another round of doc review, not yet including all of yesterday's commits.

29c8d614c3 duplicate words
diff --git a/src/include/lib/sort_template.h b/src/include/lib/sort_template.h
index 771c789ced..24d6d0006c 100644
--- a/src/include/lib/sort_template.h
+++ b/src/include/lib/sort_template.h
@@ -241,7 +241,7 @@ ST_SCOPE void ST_SORT(ST_ELEMENT_TYPE *first, size_t n
 
 /*
  * Find the median of three values.  Currently, performance seems to be best
- * if the the comparator is inlined here, but the med3 function is not inlined
+ * if the comparator is inlined here, but the med3 function is not inlined
  * in the qsort function.
  */
 static pg_noinline ST_ELEMENT_TYPE *
e7c370c7c5 pg_amcheck: remove Double semi-colon
diff --git a/src/bin/pg_amcheck/t/004_verify_heapam.pl 
b/src/bin/pg_amcheck/t/004_verify_heapam.pl
index 36607596b1..2171d236a7 100644
--- a/src/bin/pg_amcheck/t/004_verify_heapam.pl
+++ b/src/bin/pg_amcheck/t/004_verify_heapam.pl
@@ -175,7 +175,7 @@ sub write_tuple
seek($fh, $offset, 0)
or BAIL_OUT("seek failed: $!");
defined(syswrite($fh, $buffer, HEAPTUPLE_PACK_LENGTH))
-   or BAIL_OUT("syswrite failed: $!");;
+   or BAIL_OUT("syswrite failed: $!");
return;
 }
 
b745e9e60e a statistics objects
diff --git a/src/backend/statistics/extended_stats.c 
b/src/backend/statistics/extended_stats.c
index 463d44a68a..4674168ff8 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -254,7 +254,7 @@ BuildRelationExtStatistics(Relation onerel, double 
totalrows,
  * that would require additional columns.
  *
  * See statext_compute_stattarget for details about how we compute statistics
- * target for a statistics objects (from the object target, attribute targets
+ * target for a statistics object (from the object target, attribute targets
  * and default statistics target).
  */
 int
e7d5c5d9dc guc.h: remove mention of "doit"
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index 1892c7927b..1126b34798 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -90,8 +90,7 @@ typedef enum
  * dividing line between "interactive" and "non-interactive" sources for
  * error reporting purposes.
  *
- * PGC_S_TEST is used when testing values to be used later ("doit" will always
- * be false, so this never gets stored as the actual source of any value).
+ * PGC_S_TEST is used when testing values to be used later.
  * For example, ALTER DATABASE/ROLE tests proposed per-database or per-user
  * defaults this way, and CREATE FUNCTION tests proposed function SET clauses
  * this way.  This is an interactive case, but it needs its own source value
ad5f9a2023 Caller
diff --git a/src/backend/utils/adt/jsonfuncs.c 
b/src/backend/utils/adt/jsonfuncs.c
index 9961d27df4..09fcff6729 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -1651,7 +1651,7 @@ push_null_elements(JsonbParseState **ps, int num)
  * this path. E.g. the path [a][0][b] with the new value 1 will produce the
  * structure {a: [{b: 1}]}.
  *
- * Called is responsible to make sure such path does not exist yet.
+ * Caller is responsible to make sure such path does not exist yet.
  */
 static void
 push_path(JsonbParseState **st, int level, Datum *path_elems,
@@ -4887,7 +4887,7 @@ IteratorConcat(JsonbIterator **it1, JsonbIterator **it2,
  * than just one last element, this flag will instruct to create the whole
  * chain of corresponding objects and insert the value.
  *
- * JB_PATH_CONSISTENT_POSITION for an array indicates that the called wants to
+ * JB_PATH_CONSISTENT_POSITION for an array indicates that the caller wants to
  * keep values with fixed indices. Indices for existing elements could be
  * changed (shifted forward) in case if the array is prepended with a new value
  * and a negative index out of the range, so this behavior will be prevented
9acedbd4af as
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 20e7d57d41..40a54ad0bd 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -410,7 +410,7 @@ CopyMultiInsertBufferCleanup(CopyMultiInsertInfo *miinfo,
  * Once flushed we also trim the tracked buffers list down to size by removing
  * the buffers created earliest first.
  *
- * Callers should pass 'curr_rri' is the ResultRelInfo that's currently being
+ * Callers should pass 'curr_rri' as the ResultRelInfo that's currently being
  * used.  When cleaning up old buffers we'll never remove the one for
  * 'curr_rri'.
  */
9f78de5042 exist
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 5bdaceefd5..182a133033 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -617,7 +617,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 *
 * We assume that VACUUM hasn't set pg_class.reltuples already, even
 * during a 

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-04-08 Thread Julien Rouhaud
On Thu, Apr 08, 2021 at 11:34:25AM -0400, Bruce Momjian wrote:
> 
> OK, let's get some details.  First, pg_stat_statements.queryid already
> exists (no underscore), and I don't think anyone wants to change that. 
> 
> pg_stat_activity.queryid is new, but I can imagine cases where you would
> join pg_stat_activity to pg_stat_statements to get an estimate of how
> long the query will take --- having one using an underscore and another
> one not seems odd.

Indeed, and also being able to join with a USING clause rather than an ON could
also save some keystrokes.  But unfortunately, we already have (userid, dbid)
on pg_stat_statements side vs (usesysid, datid) on pg_stat_activity side, so
this unfortunately won't fix all the oddities.
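
For reference, the join being discussed would look roughly like this
(assuming the underscored query_id spelling on the pg_stat_activity side):

select a.pid, a.query_id, s.calls, s.mean_exec_time
from pg_stat_activity a
join pg_stat_statements s
  on  s.queryid = a.query_id   -- names differ, so USING (queryid) is out
  and s.userid  = a.usesysid   -- likewise for these two pairs
  and s.dbid    = a.datid;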




Re: psql - add SHOW_ALL_RESULTS option

2021-04-08 Thread Julien Rouhaud
On Thu, Apr 08, 2021 at 01:32:12AM +, shiy.f...@fujitsu.com wrote:
> > Attached a patch which attempts to fix this by moving the cancellation
> > cancelling request after processing results.
> 
> Thank you for the fix. I tested and the problem has been solved after 
> applying your patch.

Thanks for the patch Fabien.  I've hit this issue multiple times and this is
indeed unwelcome.  Should we add it as an open item?




Re: VACUUM (DISABLE_PAGE_SKIPPING on)

2021-04-08 Thread Simon Riggs
On Thu, 8 Apr 2021 at 16:58, David Steele  wrote:

> >> It has been five months since this patch was updated, so marking
> >> Returned with Feedback.
> >>
> >> Please resubmit to the next CF when you have a new patch.
> >
> > There are 2 separate patch-sets on this thread, with separate CF entries.
> >
> > One has requested changes that have not been made. I presume this one
> > has been RwF'd.
> >
> > What is happening about the other patch?
>
> Hmmm, well https://commitfest.postgresql.org/32/2908 and
> https://commitfest.postgresql.org/32/2909 are both linked to the same
> thread with the same patch, so I thought it was a duplicate.

Nobody had mentioned any confusion. Multiple patches on one thread is common.

> It's not clear to me which patch is which, so perhaps move one CF entry
> to next CF and clarify which patch is current?

Entry: Maximize page freezing
has this patch, perfectly fine, awaiting review from Alvaro since 29 Nov.
one_freeze_then_max_freeze.v7.patch

Other patch is awaiting changes.

-- 
Simon Riggs    http://www.EnterpriseDB.com/




Re: SQL-standard function body

2021-04-08 Thread Tom Lane
Julien Rouhaud  writes:
> On Thu, Apr 08, 2021 at 02:58:02AM -0400, Tom Lane wrote:
>> No, because if that were the explanation then we'd be getting no
>> buildfarm coverage at all for pg_stat_statements.  Which aside
>> from being awful contradicts the results at coverage.postgresql.org.

> Is there any chance that coverage.postgresql.org isn't backed by the buildfarm
> client but a plain make check-world or something like that?

Hmm, I think you're right.  Poking around in the log files from one
of my own buildfarm animals, there's no evidence that pg_stat_statements
is being tested at all.  Needless to say, that's just horrid :-(

I see that contrib/test_decoding also sets NO_INSTALLCHECK = 1,
and the reason it gets tested is that the buildfarm script has
a special module for that.  I guess we need to clone that module,
or maybe better, find a way to generalize it.

There are also some src/test/modules modules that set NO_INSTALLCHECK,
but apparently those do have coverage (modules-check is the step that
runs their SQL tests, and then the TAP tests if any get broken out
as separate buildfarm steps).

regards, tom lane




Re: Autovacuum on partitioned table (autoanalyze)

2021-04-08 Thread Alvaro Herrera
On 2021-Apr-08, Tomas Vondra wrote:

> On 4/8/21 5:27 PM, Alvaro Herrera wrote:
>
> > Same as for any other relation: ANALYZE would set it, after it's done
> > scanning the table.  We would need to make sure that nothing resets it to
> > empty, though, and that it doesn't cause issues elsewhere.  (The patch I
> > sent contains the minimal change to make it work, but of course that's
> > missing having other pieces of code maintain it.)
> 
> So ANALYZE would inspect the child relations, sum the reltuples and set
> it for the parent? IMO that'd be problematic because it'd mean we're
> comparing the current number of changes with reltuples value which may
> be arbitrarily stale (if we haven't analyzed the parent for a while).

What?  Not at all.  reltuples would be set by ANALYZE on one run, and
then the value is available for the future autovacuum run.  That's how
it works for regular tables too, so I'm not sure what problem you have
with that.  The (possibly stale) reltuples value is multiplied by the
scale factor, and added to the analyze_threshold value, and that's
compared with the current changes_since_analyze to determine whether to
analyze or not.
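
Spelled out, the check is roughly the following (ignoring per-table
reloptions):

-- analyze when changes_since_analyze > threshold + scale_factor * reltuples
select s.relname
from pg_stat_all_tables s
join pg_class c on c.oid = s.relid
where s.n_mod_since_analyze >
      current_setting('autovacuum_analyze_threshold')::float8
      + current_setting('autovacuum_analyze_scale_factor')::float8 * c.reltuples;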

> That's essentially the issue I described when explaining why I think the
> code needs to propagate the changes, reread the stats and then evaluate
> which relations need vacuuming. It's similar to the issue of comparing
> old changes_since_analyze vs. current reltuples, which is why the code
> is rereading the stats before checking the thresholds. This time it's
> the opposite direction - the reltuples might be stale.

Well, I don't think the issue is the same.  reltuples is always stale,
even for regular tables, because that's just how it works.
changes_since_analyze is not stale for regular tables, and that's why it
makes sense to propagate it from partitions to ancestors prior to
checking the analyze condition.

> FWIW I think the current refresh logic is not quite correct, because
> autovac_refresh_stats does some throttling (STATS_READ_DELAY). It
> probably needs a "force" parameter to ensure it actually reads the
> current stats in this one case.

Hmm ... good catch, but actually that throttling only applies to the
launcher.  do_autovacuum runs in the worker, so there's no throttling.

-- 
Álvaro Herrera    39°49'30"S 73°17'W




Re: VACUUM (DISABLE_PAGE_SKIPPING on)

2021-04-08 Thread David Steele

On 4/8/21 11:41 AM, Simon Riggs wrote:

On Thu, 8 Apr 2021 at 16:23, David Steele  wrote:


On 3/17/21 4:50 PM, Simon Riggs wrote:

On Fri, 12 Mar 2021 at 22:16, Tomas Vondra
 wrote:


On 1/28/21 2:33 PM, Simon Riggs wrote:

On Thu, 28 Jan 2021 at 12:53, Masahiko Sawada  wrote:


This entry has been "Waiting on Author" status and the patch has not
been updated since Nov 30. Are you still planning to work on this?


Yes, new patch version tomorrow. Thanks for the nudge and the review.



So, is it tomorrow already? ;-)


Been a long day. ;-)


It has been five months since this patch was updated, so marking
Returned with Feedback.

Please resubmit to the next CF when you have a new patch.


There are 2 separate patch-sets on this thread, with separate CF entries.

One has requested changes that have not been made. I presume this one
has been RwF'd.

What is happening about the other patch?


Hmmm, well https://commitfest.postgresql.org/32/2908 and 
https://commitfest.postgresql.org/32/2909 are both linked to the same 
thread with the same patch, so I thought it was a duplicate.


It's not clear to me which patch is which, so perhaps move one CF entry 
to next CF and clarify which patch is current?


Regards,
--
-David
da...@pgmasters.net




Re: Autovacuum on partitioned table (autoanalyze)

2021-04-08 Thread Tomas Vondra
On 4/8/21 5:27 PM, Alvaro Herrera wrote:
> On 2021-Apr-08, Tomas Vondra wrote:
> 
>> On 4/8/21 5:22 AM, Alvaro Herrera wrote:
> 
>>> However, I just noticed there is a huge problem, which is that the new
>>> code in relation_needs_vacanalyze() is doing find_all_inheritors(), and
>>> we don't necessarily have a snapshot that lets us do that.  While adding
>>> a snapshot acquisition at that spot is a very easy fix, I hesitate to
>>> fix it that way, because the whole idea there seems quite wasteful: we
>>> have to look up, open and lock every single partition, on every single
>>> autovacuum iteration through the database.  That seems bad.  I'm
>>> inclined to think that a better idea may be to store reltuples for the
>>> partitioned table in pg_class.reltuples, instead of having to add up the
>>> reltuples of each partition.  I haven't checked if this is likely to
>>> break anything.
>>
>> How would that value get updated, for the parent?
> 
> Same as for any other relation: ANALYZE would set it, after it's done
> scanning the table.  We would need to make sure that nothing resets it to
> empty, though, and that it doesn't cause issues elsewhere.  (The patch I
> sent contains the minimal change to make it work, but of course that's
> missing having other pieces of code maintain it.)
> 

So ANALYZE would inspect the child relations, sum the reltuples and set
it for the parent? IMO that'd be problematic because it'd mean we're
comparing the current number of changes with reltuples value which may
be arbitrarily stale (if we haven't analyzed the parent for a while).

That's essentially the issue I described when explaining why I think the
code needs to propagate the changes, reread the stats and then evaluate
which relations need vacuuming. It's similar to the issue of comparing
old changes_since_analyze vs. current reltuples, which is why the code
is rereading the stats before checking the thresholds. This time it's
the opposite direction - the reltuples might be stale.

FWIW I think the current refresh logic is not quite correct, because
autovac_refresh_stats does some throttling (STATS_READ_DELAY). It
probably needs a "force" parameter to ensure it actually reads the
current stats in this one case.

>>> (Also, a minor buglet: if we do ANALYZE (col1), then ANALYZE (col2) a
>>> partition, then we repeatedly propagate the counts to the parent table,
>>> so we would cause the parent to be analyzed more times than it should.
>>> Sounds like we should not send the ancestor list when a column list is
>>> given to manual analyze.  I haven't verified this, however.)
>>
>> Are you sure? I haven't tried, but shouldn't this be prevented by only
>> sending the delta between the current and last reported value?
> 
> I did try, and yes it behaves as you say.
> 

OK, good.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: VACUUM (DISABLE_PAGE_SKIPPING on)

2021-04-08 Thread Simon Riggs
On Thu, 8 Apr 2021 at 16:23, David Steele  wrote:
>
> On 3/17/21 4:50 PM, Simon Riggs wrote:
> > On Fri, 12 Mar 2021 at 22:16, Tomas Vondra
> >  wrote:
> >>
> >> On 1/28/21 2:33 PM, Simon Riggs wrote:
> >>> On Thu, 28 Jan 2021 at 12:53, Masahiko Sawada  
> >>> wrote:
> >>>
>  This entry has been "Waiting on Author" status and the patch has not
>  been updated since Nov 30. Are you still planning to work on this?
> >>>
> >>> Yes, new patch version tomorrow. Thanks for the nudge and the review.
> >>>
> >>
> >> So, is it tomorrow already? ;-)
> >
> > Been a long day. ;-)
>
> It has been five months since this patch was updated, so marking
> Returned with Feedback.
>
> Please resubmit to the next CF when you have a new patch.

There are 2 separate patch-sets on this thread, with separate CF entries.

One has requested changes that have not been made. I presume this one
has been RwF'd.

What is happening about the other patch?

-- 
Simon Riggs    http://www.EnterpriseDB.com/



