Re: shared-memory based stats collector

2021-03-08 Thread Kyotaro Horiguchi
At Sat, 6 Mar 2021 00:32:07 +0900, Fujii Masao  
wrote in 
> 
> 
> On 2021/03/05 17:18, Kyotaro Horiguchi wrote:
> > At Thu, 21 Jan 2021 12:03:48 +0900 (JST), Kyotaro Horiguchi
> >  wrote in
> >> Commit 960869da08 (database statistics) conflicted with this. Rebased.
> >>
> >> I'm concerned about the behavior that pgstat_update_connstats calls
> >> GetCurrentTimestamp() every time stats update happens (with intervals
> >> of 10s-60s in this patch). But I didn't change that design since that
> >> happens with about 0.5s intervals in master and the rate is largely
> >> reduced in this patch, to make this patch simpler.
> > I stepped on my foot, and another commit conflicted. Just rebased.
> 
> Thanks for rebasing the patches!
> 
> I think that the 0003 patch is self-contained and useful; for example,
> it enables us to monitor the archiver process in pg_stat_activity. So IMO
> it's worth pushing the 0003 patch first.

I'm not sure the archiver process is worth real-time monitoring, but I
agree that the patch makes it possible.  Anyway, it is required by this
patchset and I'm happy to see it committed beforehand.

Thanks for the review.

> Here are the review comments for 0003 patch.
> 
> + /* Archiver process's latch */
> + Latch  *archiverLatch;
> + /* Current shared estimate of appropriate spins_per_delay value */
> 
> The last line in the above seems not necessary.

Oops. It seems to be garbage left over from a past rebase. Removed.

> In proc.h, NUM_AUXILIARY_PROCS needs to be incremented.

Right.  Increased to 5 and rewrote the comment.

> /* --
>  * Functions called from postmaster
>  * --
>  */
> extern intpgarch_start(void);
> 
> In pgarch.h, the above is not necessary.

Removed.

> +extern void XLogArchiveWakeup(void);
> 
> This seems no longer necessary.
> 
> +extern void XLogArchiveWakeupStart(void);
> +extern void XLogArchiveWakeupEnd(void);
> +extern void XLogArchiveWakeup(void);
> 
> These seem also no longer necessary.

Sorry for all the leftover garbage. Removed all of it.

>   PgArchPID = 0;
>   if (!EXIT_STATUS_0(exitstatus))
> - LogChildExit(LOG, _("archiver process"),
> -  pid, exitstatus);
> - if (PgArchStartupAllowed())
> - PgArchPID = pgarch_start();
> + HandleChildCrash(pid, exitstatus,
> +  _("archiver 
> process"));
> 
> I don't think that we should treat a non-zero exit condition as a crash,
> as before. Otherwise, when archive_command fails on a signal, the
> archiver emits a FATAL error, which leads to a server restart.

Sounds reasonable. Now the archiver is treated the same way as the wal
receiver.  Specifically, exit(1) doesn't cause a server restart.

> - * walwriter, autovacuum, or background worker.
> + * walwriter, autovacuum, archiver or background worker.
>   *
>   * The objectives here are to clean up our local state about the child
>   * process, and to signal all other remaining children to quickdie.
> @@ -3609,6 +3606,18 @@ HandleChildCrash(int pid, int exitstatus, const
> char *procname)
>   signal_child(AutoVacPID, (SendStop ? SIGSTOP : SIGQUIT));
>   }
>  +/* Take care of the archiver too */
> + if (pid == PgArchPID)
> + PgArchPID = 0;
> + else if (PgArchPID != 0 && take_action)
> + {
> + ereport(DEBUG2,
> + (errmsg_internal("sending %s to process %d",
> +  (SendStop ? 
> "SIGSTOP" : "SIGQUIT"),
> +  (int) 
> PgArchPID)));
> + signal_child(PgArchPID, (SendStop ? SIGSTOP : SIGQUIT));
> + }
> +
> 
> Same as above.

Mmm.  In the first place, I found that I had forgotten to remove the
existing code to handle the archiver... Removed it instead of the above,
which was added by this patch.  Since the process becomes an auxiliary
process, there is no reason to handle it differently from the other
auxiliary processes, is there?

> In xlogarchive.c, "#include "storage/pmsignal.h"" is no longer
> necessary.

Removed.

> pgarch_forkexec() should be removed from pgarch.c because it's no
> longer used.

Right. Removed. EXEC_BACKEND still fails for another reason, a prototype
mismatch of PgArchiverMain. Fixed that together with this.

> /* 
>  * Public functions called from postmaster follow
>  * 
>  */
> 
> The definition of PgArchiverMain() should be placed just
> after the above comment.

The module no longer has a function called from the postmaster. Now
PgArchiverMain() is placed just below "/* Main entry point...".

> exit(0) in PgArchiverMain() should be proc_exit(0)?

Yeah, proc_exit() says as it should be 

Re: [HACKERS] Custom compression methods

2021-03-08 Thread Justin Pryzby
On Tue, Mar 09, 2021 at 01:04:10PM +0530, Dilip Kumar wrote:
> On Tue, Mar 9, 2021 at 2:45 AM Robert Haas  wrote:
> >
> > On Mon, Mar 8, 2021 at 3:59 PM Justin Pryzby  wrote:
> > > > It would be nice to have a way to force
> > > > anything compressed with the old method to be re-compressed with the
> > > > new method, but not having that doesn't preclude allowing the
> > > > parameter to be changed.
> > >
> > > Doesn't vacuum full/cluster/dump+restore do that ?
> >
> > Well, dump and restore will do it, certainly, but I don't think VACUUM
> > FULL or CLUSTER will. I haven't tested it, though, so maybe I'm wrong.
> 
> Yeah, vacuum full or cluster will not re-compress the data.  How about
> providing syntax ALTER TABLE  ALTER COLUMN  SET
> COMPRESSION  REWRITE?

It'd be strange to me if "rewrite" were associated with a column.

Depending on what data structures you use, you might accidentally fail to
rewrite the table if someone wrote something like:
postgres=# alter table t alter a set compression pglz REWRITE, alter a set 
compression pglz;

Although I hope that's something to support someday for columnar AMs, I think
table rewriting is currently done in its entirety, and the syntax should reflect that.

-- 
Justin




Re: [HACKERS] Custom compression methods

2021-03-08 Thread Dilip Kumar
On Tue, Mar 9, 2021 at 2:45 AM Robert Haas  wrote:
>
> On Mon, Mar 8, 2021 at 3:59 PM Justin Pryzby  wrote:
> > > It would be nice to have a way to force
> > > anything compressed with the old method to be re-compressed with the
> > > new method, but not having that doesn't preclude allowing the
> > > parameter to be changed.
> >
> > Doesn't vacuum full/cluster/dump+restore do that ?
>
> Well, dump and restore will do it, certainly, but I don't think VACUUM
> FULL or CLUSTER will. I haven't tested it, though, so maybe I'm wrong.

Yeah, vacuum full or cluster will not re-compress the data.  How about
providing the syntax ALTER TABLE <table> ALTER COLUMN <column> SET
COMPRESSION <method> REWRITE?  So if REWRITE is given then we
will always rewrite the table, and in the course of that rewrite we will
re-compress the data.  Even if REWRITE is given and the compression method
is the same as the existing one, we still cannot skip the rewrite,
because we don't know the history: the user might have altered the
compression method multiple times without a rewrite.
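
For illustration, a sketch of the proposed (hypothetical, never-committed)
syntax, with invented table and column names:

ALTER TABLE mytab ALTER COLUMN payload SET COMPRESSION lz4;
    -- only newly stored values use lz4
ALTER TABLE mytab ALTER COLUMN payload SET COMPRESSION lz4 REWRITE;
    -- proposed: force a rewrite that also re-compresses existing values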

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Pavel Stehule
On Tue, Mar 9, 2021 at 7:57 AM, Joel Jacobson  wrote:

> On Mon, Mar 8, 2021, at 21:46, Pavel Stehule wrote:
>
> so what about?
>
> CREATE OR REPLACE FUNCTION unnest_slice(anyarray, int)
> RETURNS SETOF anyarray AS $$
> DECLARE r $1%type;
> BEGIN
>   FOREACH r SLICE $2 IN ARRAY $1 --- now $2 should be constant
>   LOOP
> RETURN NEXT r;
>   END LOOP;
> END;
> $$ LANGUAGE plpgsql;
>
>
> Not sure I understand. Is the suggestion to add "SLICE" as syntactic sugar
> in PL/pgSQL to invoke the proposed two-argument C-version of unnest()?
>

there are two ideas:

1. the behaviour can be the same as the SLICE clause of the FOREACH
statement (see the sketch after this list)

2. use unnest_slice as the name - the function "unnest" is already relatively
rich today, and adding another overload doesn't look too practical. But this is
just an idea. I can imagine more forms of slicing or unnesting, so it can
be practical to use names other than just "unnest".
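
For readers unfamiliar with the SLICE clause mentioned in idea 1, a minimal
sketch of its behaviour (the array literal is only an illustration):

DO $$
DECLARE r int[];
BEGIN
  -- SLICE 1 iterates over the 1-D sub-arrays of a 2-D array
  FOREACH r SLICE 1 IN ARRAY ARRAY[[1,2],[3,4],[5,6]]
  LOOP
    RAISE NOTICE 'slice: %', r;   -- {1,2}, then {3,4}, then {5,6}
  END LOOP;
END;
$$;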

Personally, I don't much like using 2D arrays for this purpose. Queries over
this functionality will be harder to read (it is like Fortran 77). I understand
that right now there is no other possibility, because pg cannot build an array
type from a function signature, so it is harder to build an array of record
types.

We can easily make a tuple store of records - like FUNCTION fx(OUT a int,
OUT b int) RETURNS SETOF RECORD. But now, thanks to Tom and Amit's work,
simple expression evaluation is significantly faster than SQL evaluation,
so using any SRF has a performance impact. What I miss is the possibility
to write functions like FUNCTION fx(OUT a int, OUT b int) RETURNS ARRAY.
With that possibility it would be easy to write the functions you need,
and it would not be necessary to use 2D arrays. If the result of the regexp
functions were arrays of records, then a new unnest function would not be
necessary. So this is not a good direction: instead of fixing core issues,
we design workarounds. Arrays of composites could see much wider usage.

Regards

Pavel






> /Joel
>
>
>


Re: pg_upgrade fails with non-standard ACL

2021-03-08 Thread Noah Misch
On Thu, Feb 11, 2021 at 08:16:30PM +0300, Anastasia Lubennikova wrote:
> On 28.01.2021 09:55, Noah Misch wrote:
> >On Wed, Jan 27, 2021 at 07:32:42PM +0300, Anastasia Lubennikova wrote:
> >>On 27.01.2021 14:21, Noah Misch wrote:
> >>>On Mon, Jan 25, 2021 at 10:14:43PM +0300, Anastasia Lubennikova wrote:
> 1) Could you please clarify, what do you mean by REVOKE failures?
> 
> I tried following example:
> 
> Start 9.6 cluster.
> 
> REVOKE ALL ON function pg_switch_xlog() FROM public;
> REVOKE ALL ON function pg_switch_xlog() FROM backup;
> 
> The upgrade to the current master passes with and without patch.
> It seems that current implementation of pg_upgrade doesn't take into 
> account
> revoke ACLs.
> >>>I think you can observe it by adding "revoke all on function
> >>>pg_stat_get_subscription from public;" to 
> >>>test_add_acl_to_catalog_objects.sql
> >>>and then rerunning your test script.  pg_dump will reproduce that REVOKE,
> >>>which would fail if postgresql.org had removed that function.  That's 
> >>>fine, so
> >>>long as a comment mentions the limitation.

I've now tested exactly that.  It didn't cause a test script failure, because
pg_upgrade_ACL_test.sh runs "pg_upgrade --check" but does not run the final
pg_upgrade (without --check).  The failure happens only without --check.  I've
attached a modified version of your test script that reproduces the problem.
It uses a different function, timenow(), so the test does not depend on
test_rename_catalog_objects_v14.  The attached script fails with this output:

===
Consult the last few lines of "pg_upgrade_dump_13325.log" for
the probable cause of the failure.
Failure, exiting
===

That log file contains:

===
command: "/tmp/workdir/postgresql_bin_test_new/bin/pg_dump" --host 
/home/nm/src/pg/backbranch/extra --port 50432 --username nm --schema-only 
--quote-all-identifiers --binary-upgrade --format=custom   
--file="pg_upgrade_dump_13325.custom" 'dbname=postgres' >> 
"pg_upgrade_dump_13325.log" 2>&1


command: "/tmp/workdir/postgresql_bin_test_new/bin/pg_restore" --host 
/home/nm/src/pg/backbranch/extra --port 50432 --username nm --clean --create 
--exit-on-error --verbose --dbname template1 "pg_upgrade_dump_13325.custom" >> 
"pg_upgrade_dump_13325.log" 2>&1
pg_restore: connecting to database for restore
pg_restore: dropping DATABASE PROPERTIES postgres
pg_restore: dropping DATABASE postgres
pg_restore: creating DATABASE "postgres"
pg_restore: connecting to new database "postgres"
pg_restore: creating COMMENT "DATABASE "postgres""
pg_restore: creating DATABASE PROPERTIES "postgres"
pg_restore: connecting to new database "postgres"
pg_restore: creating pg_largeobject "pg_largeobject"
pg_restore: creating ACL "pg_catalog.FUNCTION "timenow"()"
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 3047; 0 0 ACL FUNCTION "timenow"() nm
pg_restore: error: could not execute query: ERROR:  function 
pg_catalog.timenow() does not exist
Command was: REVOKE ALL ON FUNCTION "pg_catalog"."timenow"() FROM PUBLIC;
===

> >>In the updated patch, I implemented generation of both GRANT ALL and REVOKE
> >>ALL for problematic objects. If I understand it correctly, these calls will
> >>clean object's ACL completely. And I see no harm in doing this, because the
> >>objects don't exist in the new cluster anyway.

Here's fix_system_objects_ACL.sql with your v14 test script:

===
\connect postgres
 GRANT ALL  ON function  pg_catalog.pg_stat_get_subscription(pg_catalog.oid)  
TO "backup" ;
 REVOKE ALL  ON function  pg_catalog.pg_stat_get_subscription(pg_catalog.oid)  
FROM "backup" ;
 GRANT ALL  ON pg_catalog.pg_stat_subscription  TO "backup" ;
 REVOKE ALL  ON pg_catalog.pg_stat_subscription  FROM "backup" ;
 GRANT ALL  ON pg_catalog.pg_subscription  TO "backup","test" ;
 REVOKE ALL  ON pg_catalog.pg_subscription  FROM "backup","test" ;
 GRANT ALL  (subenabled)  ON pg_catalog.pg_subscription  TO "backup" ;
 REVOKE ALL  (subenabled)  ON pg_catalog.pg_subscription  FROM "backup" ;
===

Considering the REVOKE statements, those new GRANT statements have no effect.
To prevent the final pg_upgrade failure, fix_system_objects_ACL.sql would need
"GRANT ALL ON FUNCTION pg_catalog.pg_stat_get_subscription(pg_catalog.oid) TO
PUBLIC;".  Alternately, again, I don't mind if this case continues to fail, so
long as a comment mentions the limitation.  How would you like to proceed?

One other thing:

> diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
> index 43fc297eb6..f2d593f574 100644
> --- a/src/bin/pg_upgrade/check.c
> +++ b/src/bin/pg_upgrade/check.c
...
> +check_for_changed_signatures(void)
> +{
...
> + /*
> +  *
> +  * AclInfo array is sorted by obj_ident. This allows us to 
> compare
> +  * AclInfo entries with the query result above efficiently.
> +  */
> + for (aclnum = 0; aclnum < dbinfo->non_def_acl_arr.nacls; 
> aclnum++)
> +

RE: [PoC] Non-volatile WAL buffer

2021-03-08 Thread tsunakawa.ta...@fujitsu.com
From: Takashi Menjo 
> > The other question is whether simply placing WAL on DAX (without any
> > code changes) is safe. If it's not, then all the "speedups" are
> > computed with respect to unsafe configuration and so are useless. And
> > BTT should be used instead, which would of course produce very different
> results.
> 
> I think it's safe, thanks to the checksum in the header of each WAL record
> (xl_crc in struct XLogRecord). In DAX mode, user data (the WAL record
> here) is written to the PMEM device in a smaller unit (probably a byte or a
> cache line) than the traditional 512-byte disk sector. So a torn write such
> that "some bytes in a sector persist, other bytes not" can occur on a crash.
> AFAICS, however, the checksum for WAL records can also handle such a
> torn-write case.

I'm afraid I may be misunderstanding, so let me ask a naive question.

I understood "simply placing WAL on DAX (without any code changes)" means 
placing WAL files on DAX-aware filesystems such as ext4 and xfs, withoug 
modifying Postgres source code.  That is, use the PMEM as a high performance 
storage device.  Is this correct?

Second, is that what you represented as "master" in your test results?

I'd simply like to know what percentage of performance improvement we can 
expect by utilizing PMDK and modifying Postgres source code, and how much 
improvement we consider worthwhile.


Regards
Takayuki Tsunakawa





Re: [HACKERS] Custom compression methods

2021-03-08 Thread Justin Pryzby
On Mon, Mar 08, 2021 at 03:32:39PM +0530, Dilip Kumar wrote:
> On Sun, Mar 7, 2021 at 1:27 AM Justin Pryzby  wrote:
> >
> > On Sat, Mar 06, 2021 at 08:59:16PM +0530, Dilip Kumar wrote:
> > > - Alter table set compression, will not rewrite the old data, so only
> > > the new tuple will be compressed with the new compression method.
> > > - No preserve.
> >
> > In this patch, SET default_toast_compression=lz4 "works" even if 
> > without-lz4,
> > but then CREATE TABLE fails.  You should either allow table creation (as
> > above), or check in check_default_toast_compression() if lz4 is enabled.
> > Its comment about "catalog access" is incorrect now.
> 
> As of now I have made GUC behavior similar to the CREATE TABLE, in
> both case it will throw an error if it is not compiled with lz4
> method.

In the latest patch, CREATE TABLE (t text COMPRESS lz4) fails if --without-lz4.
I think that's the right choice, since otherwise we should also allow ALTER SET
COMPRESSION lz4, which feels wrong.

This comment and associated conditional is still wrong, since there's no
catalog access anymore:

+* If we aren't inside a transaction, or not connected to a database, we
+* cannot do the catalog access necessary to verify the method.  Must
+* accept the value on faith.

This shouldn't refer to "access method" (probably originally my error):

+* When source == PGC_S_TEST, don't throw a hard error 
for a
+* nonexistent table access method, only a NOTICE. See 
comments in
+* guc.h.

I'm not sure why that function is now in guc.c ?

Now I think this comment should just say /* GUC */
+/* Compile-time default */
+char   *default_toast_compression = DEFAULT_TOAST_COMPRESSION;

It sounds like after that, you should merge that part into 0001.

In 0001, configure.ac is missing this :
AC_MSG_RESULT([$with_lz4])

Also, Thomas updated the mac CI to install LZ4.
However, it fails to find the library.  Maybe configure.ac needs to use
pkg-config.  Or maybe the mac build needs to use this - we're not sure.
--with-includes=/usr/local/opt --with-libraries=/usr/local/opt

-- 
Justin




RE: Parallel INSERT (INTO ... SELECT ...)

2021-03-08 Thread houzj.f...@fujitsu.com
> >
> > I've attached an updated set of patches with the suggested locking changes.
> >
> 
> Amit L, others, do let me know if you have still more comments on
> 0001* patch or if you want to review it further?

I took a look into the latest 0001 patch, and it looks good to me.

Best regards,
houzj


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Joel Jacobson
On Mon, Mar 8, 2021, at 21:46, Pavel Stehule wrote:
> so what about?
> 
> CREATE OR REPLACE FUNCTION unnest_slice(anyarray, int)
> RETURNS SETOF anyarray AS $$
> DECLARE r $1%type;
> BEGIN
>   FOREACH r SLICE $2 IN ARRAY $1 --- now $2 should be constant
>   LOOP
> RETURN NEXT r;
>   END LOOP;
> END;
> $$ LANGUAGE plpgsql;

Not sure I understand. Is the suggestion to add "SLICE" as syntactic sugar in 
PL/pgSQL to invoke the proposed two-argument C-version of unnest()?

/Joel



Re: [PoC] Non-volatile WAL buffer

2021-03-08 Thread Takashi Menjo
Hi Tomas,

> Hello Takashi-san,
>
> On 3/5/21 9:08 AM, Takashi Menjo wrote:
> > Hi Tomas,
> >
> > Thank you so much for your report. I have read it with great interest.
> >
> > Your conclusion sounds reasonable to me. My patchset you call "NTT /
> > segments" got as good performance as "NTT / buffer" patchset. I have
> > been worried that calling mmap/munmap for each WAL segment file could
> > have a lot of overhead. Based on your performance tests, however, the
> > overhead looks less than what I thought. In addition, "NTT / segments"
> > patchset is more compatible to the current PG and more friendly to
> > DBAs because that patchset uses WAL segment files and does not
> > introduce any other new WAL-related file.
> >
>
> I agree. I was actually a bit surprised it performs this well, mostly in
> line with the "NTT / buffer" patchset. I've seen significant issues with
> our simple experimental patches, which however went away with larger WAL
> segments. But the "NTT / segments" patch does not have that issue, so
> either our patches were doing something wrong, or perhaps there was some
> other issue (not sure why larger WAL segments would improve that).
>
> Do these results match your benchmarks? Or are you seeing significantly
> different behavior?

I ran a performance test for "NTT / segments" and added its results
to my previous report [1], under the same conditions. The updated graph
is attached to this mail. Note that some legends are renamed: "Mapped
WAL file" to "NTT / simple", and "Non-volatile WAL buffer" to "NTT /
buffer."

The graph tells me that "NTT / segments" performs as well as "NTT /
buffer." This matches the results you reported.

> Do you have any thoughts regarding the impact of full-page writes? So
> far all the benchmarks we did focused on small OLTP transactions on data
> sets that fit into RAM. The assumption was that that's the workload that
> would benefit from this, but maybe that's missing something important
> about workloads producing much larger WAL records? Say, workloads
> working with large BLOBs, bulk loads etc.

I'd say that more work is needed for workloads producing a large
amount of WAL (in the number of records or the size per record, or
both). Based on the case Gang reported, which I have tried to
reproduce in this thread [2][3], the current inserting and flushing
method can be unsuitable for such workloads. That case was for "NTT /
buffer," but I think it also applies to "NTT / segments."

> The other question is whether simply placing WAL on DAX (without any
> code changes) is safe. If it's not, then all the "speedups" are computed
> with respect to unsafe configuration and so are useless. And BTT should
> be used instead, which would of course produce very different results.

I think it's safe, thanks to the checksum in the header of each WAL record
(xl_crc in struct XLogRecord). In DAX mode, user data (the WAL record
here) is written to the PMEM device in a smaller unit (probably a byte
or a cache line) than the traditional 512-byte disk sector. So a
torn write such that "some bytes in a sector persist, other bytes not"
can occur on a crash. AFAICS, however, the checksum for WAL records
can also handle such a torn-write case.

> > I also think that supporting both file I/O and mmap is better than
> > supporting only mmap. I will continue my work on "NTT / segments"
> > patchset to support both ways.
> >
>
> +1
>
> > In the following, I will answer "Issues & Questions" you reported.
> >
> >
> >> While testing the "NTT / segments" patch, I repeatedly managed to crash 
> >> the cluster with errors like this:
> >>
> >> 2021-02-28 00:07:21.221 PST client backend [3737139] WARNING:  creating 
> >> logfile segment just before
> >> mapping; path "pg_wal/00010007002F"
> >> 2021-02-28 00:07:21.670 PST client backend [3737142] WARNING:  creating 
> >> logfile segment just before
> >> mapping; path "pg_wal/000100070030"
> >> ...
> >> 2021-02-28 00:07:21.698 PST client backend [3737145] WARNING:  creating 
> >> logfile segment just before
> >> mapping; path "pg_wal/000100070030"
> >> 2021-02-28 00:07:21.698 PST client backend [3737130] PANIC:  could not 
> >> open file
> >> "pg_wal/000100070030": No such file or directory
> >>
> >> I do believe this is a thinko in the 0008 patch, which does XLogFileInit 
> >> in XLogFileMap. Notice there are multiple
> >> "creating logfile" messages with the ..0030 segment, followed by the 
> >> failure. AFAICS the XLogFileMap may be
> >> called from multiple backends, so they may call XLogFileInit concurrently, 
> >> likely triggering some sort of race
> >> condition. It's fairly rare issue, though - I've only seen it twice from 
> >> ~20 runs.
> >
> > Thank you for your report. I found that it is rather patch 0009 that has an
> > issue, and that will also cause WAL loss. I should have set
> > use_existent to true, or InstallXlogFileSegment and BasicOpenFile in
> > XLogFileInit can be racy. I 

Re: Make stream_prepare an optional callback

2021-03-08 Thread Amit Kapila
On Mon, Mar 8, 2021 at 2:43 PM Ajin Cherian  wrote:
>
> Hi Hackers,
>
> As part of commit 0aa8a0 , new plugin methods (callbacks) were defined for 
> enabling two_phase commits.
> 5 callbacks were required:
> * begin_prepare
> * prepare
> * commit_prepared
> * rollback_prepared
> * stream_prepare
>
> and 1 callback was optional:
> * filter_prepare
>
> I don't think stream_prepare should be made a required callback for enabling 
> two_phase commits. The stream_prepare callback is required only when a logical 
> replication slot is configured both for streaming in-progress transactions 
> and for two_phase commits. Plugins can and should be allowed to disallow this 
> combination of streaming and two_phase at the same time, in 
> which case stream_prepare should be an optional callback.
>

Sounds reasonable to me. I also don't see a reason why we need to make
this a necessary callback. Some plugin authors might just want 2PC
without streaming support.

Markus, others working on logical decoding plugins, do you have any
opinion on this?

-- 
With Regards,
Amit Kapila.




Re: [PATCH] pg_permissions

2021-03-08 Thread Joel Jacobson
On Tue, Mar 9, 2021, at 04:01, Chapman Flack wrote:
> On Sat, Mar 06, 2021 at 08:03:17PM +0100, Joel Jacobson wrote:
> >regclass   |  obj_desc   | grantor | grantee |
> privilege_type | is_grantable
> >
> --+-+-+-++--
> 
> 1. Is there a reason not to make 'grantor' and 'grantee' of type regrole?

I considered it, but this view is tailored for human-use,
to be used by experienced as well as beginner users.

>In other words, s/rolname/oid::regrole/ throughout the view definition.
>It looks the same visually, but should be easier to build on in a larger
>query. 

If using regrole, users would need to know to cast it to text in order to
search for values, e.g.:

SELECT * FROM pg_permissions WHERE grantee = 'foobar';
ERROR:  invalid input syntax for type oid: "foobar"
LINE 1: SELECT * FROM pg_permissions WHERE grantee = 'foobar';

SELECT * FROM pg_permissions WHERE grantee LIKE 'foo%';
ERROR:  operator does not exist: regrole ~~ unknown
LINE 1: SELECT * FROM pg_permissions WHERE grantee LIKE 'foo%';
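
For what it's worth, such searches would work with an explicit cast, which is
exactly the extra knowledge being referred to (a sketch, assuming a regrole
grantee column):

SELECT * FROM pg_permissions WHERE grantee::text = 'foobar';
SELECT * FROM pg_permissions WHERE grantee::text LIKE 'foo%';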

> 2. Also to facilitate use in a larger query, how about columns for the
>objid and objsubid, in addition to the human-friendly obj_desc?

I think it's good to prevent users from abusing this view,
by not including oids and other columns needed for proper
integration in larger queries/systems.

Otherwise there is a risk users will be sloppy and just join pg_permissions,
when they really should be joining some specific catalog.

Also, lots of extra columns not needed by humans just make the view less
readable, since you would more often need to use \x when the width of the
output doesn't fit.

Personally, I'm on a 15" MacBook Pro and I usually have two 117x24 terminals 
next to each other,
in which both pg_permissions and pg_ownerships output usually fits fine.

>And I'm not sure about using pg_attribute as the regclass for
>attributes; it's nice to look at, but could also plant the wrong idea
>that attributes have pg_attribute as their classid, when it's really
>pg_class with an objsubid. Anyway, there's the human-friendly obj_desc
>to tell you it's a column.

While pg_class is the "origin class", I think we convey more meaningful
information by using the regclass of the table that stores the aclitem[]
column - in your example, pg_attribute. This makes it more obvious to the
user that the permission is on some column, rather than on the table. In
the case where you try to drop a user, don't understand why you can't, and
then look in pg_permissions for what could be the reason, it's more helpful
to show pg_attribute than pg_class, since you then hopefully understand you
should revoke permissions for some column, and not the table.

You get this information in obj_desc as well, but I think regclass complements 
it nicely.

And it's also more precise: the permission *is* really on pg_attribute;
it just happens that pg_attribute has a multi-key primary key,
where one of the keys references pg_class.oid.

> But I think it would be useful for this view to handle the part of the story
> that involves acldefault() when the stored aclitem[] is null. I've long
> wanted a view that actually shows you all of the permissions that apply
> to something, even the ones you're supposed to Just Know, and indeed
> I wrote such a thing for $work.
> Then you could even query the view for an answer to the question "what
> are all the permissions 'public' (or '-') can exercise here?"

Seems useful, but maybe that's a different view/function?
Could it be integrated into these views without increasing complexity?

/Joel


Re: Removing vacuum_cleanup_index_scale_factor

2021-03-08 Thread Masahiko Sawada
On Tue, Mar 9, 2021 at 7:35 AM Peter Geoghegan  wrote:
>
> On Mon, Mar 8, 2021 at 1:38 PM Tom Lane  wrote:
> > As you say, the history here is a bit convoluted, but it seems like
> > a good principle to avoid interconnections between VACUUM and ANALYZE
> > as much as we can.  I haven't been paying enough attention to this
> > thread to have more insight than that.
>
> The attached patch does what I proposed earlier today: it teaches
> do_analyze_rel() to always set pg_class.reltuples for indexes when it
> would do the same thing for the heap/table relation already. It's now
> uniform in that sense.

Thank you for the patches. I looked at the 0001 patch and have a comment:

+* We don't report to the stats collector here because the stats collector
+* only tracks per-table stats.  Reset the changes_since_analyze counter
+* only if we analyzed all columns; otherwise, there is still work for
+* auto-analyze to do.

I think the comment becomes clearer if we add "if doing inherited
stats" at the top of the above paragraph, since we actually report to the
stats collector in the !inh case.

>
> Also included is a patch that removes the
> vacuum_cleanup_index_scale_factor mechanism for triggering an index
> scan during VACUUM -- that's what the second patch does (this depends
> on the first patch, really).

0002 patch looks good to me.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: [HACKERS] logical decoding of two-phase transactions

2021-03-08 Thread vignesh C
On Tue, Mar 9, 2021 at 11:01 AM Amit Kapila  wrote:
>
> On Mon, Mar 8, 2021 at 8:09 PM vignesh C  wrote:
> >
> > On Mon, Mar 8, 2021 at 6:25 PM Amit Kapila  wrote:
> > >
> >
> > I think in case of two_phase option, replicatedPtr and sentPtr never
> > becomes the same which causes this process to hang.
> >
>
> > The reason is that on the subscriber you have created a situation
> > (a PK violation) where it is not able to proceed with the initial tablesync,
> > and then the apply worker is waiting for tablesync to complete, so it
> > is not able to process new messages. I think as soon as you remove the
> > duplicate row from the table it will be able to proceed.
>
> Now, we can see a similar situation even in HEAD without 2PC though it
> is a bit tricky to reproduce. Basically, when the tablesync worker is
> in SUBREL_STATE_CATCHUP state and it has a lot of WAL to process then
> the apply worker is just waiting for it to finish applying all the WAL
> and won't process any message. So at that time, if you try to stop the
> publisher you will see the same behavior. I have simulated a lot of
> WAL processing by manually debugging the tablesync and not proceeding
> for some time. You can also try by adding sleep after the tablesync
> worker has set the state as SUBREL_STATE_CATCHUP.
>
> So, I feel this is just an expected behavior and users need to
> manually fix the situation where tablesync worker is not able to
> proceed due to PK violation. Does this make sense?
>

Thanks for the detailed explanation. This behavior looks similar to
the issue you described, so we can ignore it, as it does not seem to be
caused by this patch. I also noticed that if we handle the PK violation
error by deleting the record that causes it, the server is able to stop
immediately without any issue.

Regards,
Vignesh




Re: Tablesync early exit

2021-03-08 Thread Peter Smith
On Sun, Mar 7, 2021 at 1:33 PM Amit Kapila  wrote:
>
> On Sun, Mar 7, 2021 at 7:26 AM Peter Smith  wrote:
> >
> > Hi hackers.
> >
> > I propose a small optimization can be added to the tablesync replication 
> > code.
> >
> > This proposal (and simple patch) was first discussed here [1].
> >
>
> It might be better if you attach your proposed patch to this thread.

PSA.


Kind Regards,
Peter Smith.
Fujitsu Australia


v1-0001-Tablesync-early-exit.patch
Description: Binary data


Re: Improvements and additions to COPY progress reporting

2021-03-08 Thread Michael Paquier
On Mon, Mar 08, 2021 at 05:33:40PM +0100, Matthias van de Meent wrote:
> Seems reasonable. PFA updated patches. I've renamed the previous 0003
> to 0002 to keep git-format-patch easy.

Thanks for updating the patch.  0001 has been applied, after tweaking
a bit comments, indentation and the docs.

> This is keeping current behaviour of the implementation as committed
> with 8a4f618e, with the rationale of that patch being that this number
> should mirror the number returned by the copy command.
> 
> I am not opposed to adding another column for `tuples_inserted` and
> changing the logic accordingly (see prototype 0003), but that was not
> in the intended scope of this patchset. Unless you think that this
> should be included in this current patchset, I'll spin that patch out
> into a different thread, but I'm not sure that would make it into
> pg14.

Okay, point taken.  If there is demand for it in the future, we could
extend the existing set of columns.  After thinking more about it, the
use case is not completely clear to me from a monitoring point of
view.

I have not looked at 0002 in detail yet, but I am wondering first if
the size estimations in the expected output are actually portable.
Second, I doubt a bit that the extra cycles spent on that are actually
worth the coverage, even if the trick with an AFTER INSERT trigger is
interesting.
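
For readers who have not seen that trick, a rough sketch of an AFTER INSERT
trigger that inspects the pg_stat_progress_copy view mid-COPY (the table,
function, and trigger names are invented for illustration; the actual test in
the 0002 patch may differ):

CREATE TABLE copy_target (a int);

CREATE FUNCTION report_copy_progress() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- read this backend's own in-progress COPY row
    RAISE NOTICE 'tuples processed so far: %',
        (SELECT tuples_processed FROM pg_stat_progress_copy
          WHERE pid = pg_backend_pid());
    RETURN NEW;
END;
$$;

CREATE TRIGGER copy_progress_check
    AFTER INSERT ON copy_target
    FOR EACH ROW EXECUTE FUNCTION report_copy_progress();

-- a subsequent COPY copy_target FROM ... then emits one NOTICE per row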
--
Michael




Re: [HACKERS] logical decoding of two-phase transactions

2021-03-08 Thread Amit Kapila
On Mon, Mar 8, 2021 at 8:09 PM vignesh C  wrote:
>
> On Mon, Mar 8, 2021 at 6:25 PM Amit Kapila  wrote:
> >
>
> I think in case of two_phase option, replicatedPtr and sentPtr never
> becomes the same which causes this process to hang.
>

The reason is that on the subscriber you have created a situation
(a PK violation) where it is not able to proceed with the initial tablesync,
and then the apply worker is waiting for tablesync to complete, so it
is not able to process new messages. I think as soon as you remove the
duplicate row from the table it will be able to proceed.

Now, we can see a similar situation even in HEAD without 2PC though it
is a bit tricky to reproduce. Basically, when the tablesync worker is
in SUBREL_STATE_CATCHUP state and it has a lot of WAL to process then
the apply worker is just waiting for it to finish applying all the WAL
and won't process any message. So at that time, if you try to stop the
publisher you will see the same behavior. I have simulated a lot of
WAL processing by manually debugging the tablesync and not proceeding
for some time. You can also try by adding sleep after the tablesync
worker has set the state as SUBREL_STATE_CATCHUP.

So, I feel this is just an expected behavior and users need to
manually fix the situation where tablesync worker is not able to
proceed due to PK violation. Does this make sense?

-- 
With Regards,
Amit Kapila.




Re: Replace buffer I/O locks with condition variables (reviving an old patch)

2021-03-08 Thread Julien Rouhaud
On Mon, Mar 08, 2021 at 06:10:36PM +1300, Thomas Munro wrote:
> On Fri, Mar 5, 2021 at 12:12 PM Thomas Munro  wrote:
> > On Fri, Feb 26, 2021 at 7:08 PM Thomas Munro  wrote:
> > > Back in 2016, Robert Haas proposed to replace I/O locks with condition
> > > variables[1].  Condition variables went in and have found lots of
> > > uses, but this patch to replace a bunch of LWLocks and some busy
> > > looping did not.  Since then, it has been tested quite a lot as part
> > > of the AIO project[2], which currently depends on it.  That's why I'm
> > > interested in following up now.  I asked Robert if he planned to
> > > re-propose it and he said I should go for it, so... here I go.
> >
> > I removed a redundant (Size) cast, fixed the wait event name and
> > category (WAIT_EVENT_BUFFILE_XXX is for buffile.c stuff, not bufmgr.c
> > stuff, and this is really an IPC wait, not an IO wait despite the
> > name), updated documentation and pgindented.
> 
> More review and some proposed changes:
> 
> The old I/O lock array was the only user of struct
> LWLockMinimallyPadded, added in commit 6150a1b08a9, and it seems kinda
> strange to leave it in the tree with no user.  Of course it's remotely
> possible there are extensions using it (know of any?).  In the
> attached, I've ripped that + associated commentary out, because it's
> fun to delete dead code.  Objections?

None from me.  I don't know of any extension relying on it, and neither does
codesearch.debian.net.  I would be surprised to see any extension actually
relying on that anyway.

> Since the whole reason for that out-of-line array in the first place
> was to keep BufferDesc inside one cache line, and since it is in fact
> possible to put a new condition variable into BufferDesc without
> exceeding 64 bytes on a 64 bit x86 box, perhaps we should just do that
> instead?  I haven't yet considered other architectures or potential
> member orders.

+1 for adding the cv into BufferDesc.  That brings the struct size to exactly
64 bytes on 64-bit x86 architectures.  This won't add any extra overhead to
LOCK_DEBUG builds, as the struct was already exceeding the 64-byte threshold
there, if that even was a concern.

> I wonder if we should try to preserve user experience a little harder,
> for the benefit of people who have monitoring queries that look for
> this condition.  Instead of inventing a new wait_event value, let's
> just keep showing "BufferIO" in that column.  In other words, the
> change is that wait_event_type changes from "LWLock" to "IPC", which
> is a pretty good summary of this patch.  Done in the attached.  Does
> this make sense?

I think it does make sense, and it's good to preserve this value.

Looking at the patch itself, I don't have much to add; it all looks sensible and
I agree with the arguments in the first mail.  All regression tests pass and
the documentation builds.

I'm marking this patch as RFC.




Re: New IndexAM API controlling index vacuum strategies

2021-03-08 Thread Peter Geoghegan
On Tue, Mar 2, 2021 at 8:49 PM Masahiko Sawada  wrote:
> On Tue, Mar 2, 2021 at 2:34 PM Peter Geoghegan  wrote:
> > lazy_vacuum_table_and_indexes() should probably not skip index
> > vacuuming when we're close to exceeding the space allocated for the
> > LVDeadTuples array. Maybe we should not skip when
> > vacrelstats->dead_tuples->num_tuples is greater than 50% of
> > dead_tuples->max_tuples? Of course, this would only need to be
> > considered when lazy_vacuum_table_and_indexes() is only called once
> > for the entire VACUUM operation (otherwise we have far too little
> > maintenance_work_mem/dead_tuples->max_tuples anyway).
>
> Doesn't it actually mean we consider how many dead *tuples* we
> collected during a vacuum? I’m not sure how important the fact we’re
> close to exceeding the maintenance_work_mem space. Suppose
> maintenance_work_mem is 64MB, we will not skip both index vacuum and
> heap vacuum if the number of dead tuples exceeds 5592404 (we can
> collect 11184809 tuples with 64MB memory). But those tuples could be
> concentrated in a small number of blocks, for example in a very large
> table case. It seems to contradict the current strategy that we want
> to skip vacuum if relatively few blocks are modified. No?

There are competing considerations. I think that we need to be
sensitive to accumulating "debt" here. The cost of index vacuuming
grows in a non-linear fashion as the index grows (or as
maintenance_work_mem is lowered). This is the kind of thing that we
should try to avoid, I think. I suspect that cases where we can skip
index vacuuming and heap vacuuming are likely to involve very few dead
tuples in most cases anyway.

We should not be sensitive to the absolute number of dead tuples when
it doesn't matter (say because they're concentrated in relatively few
heap pages). But when we overrun the maintenance_work_mem space, then
the situation changes; the number of dead tuples clearly matters just
because we run out of space for the TID array. The heap page level
skew is not really important once that happens.

That said, maybe there is a better algorithm. 50% was a pretty arbitrary number.

Have you thought more about how the skipping of index vacuuming can be
configured by users? Maybe a new storage param that works like the
current SKIP_VACUUM_PAGES_RATIO constant?
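
To illustrate the shape such a knob could take, a purely hypothetical
reloption (the name and value here are invented, not part of any patch):

ALTER TABLE mytab SET (vacuum_skip_index_pages_ratio = 0.01);
    -- i.e. skip index vacuuming while at most 1% of heap blocks carry LP_DEAD items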

-- 
Peter Geoghegan




Re: [HACKERS] logical decoding of two-phase transactions

2021-03-08 Thread Amit Kapila
On Tue, Mar 9, 2021 at 9:15 AM Peter Smith  wrote:
>
> On Mon, Mar 8, 2021 at 4:58 PM vignesh C  wrote:
> >
> > LOGICAL_REP_MSG_TYPE = 'Y',
> > +   LOGICAL_REP_MSG_BEGIN_PREPARE = 'b',
> > +   LOGICAL_REP_MSG_PREPARE = 'P',
> > +   LOGICAL_REP_MSG_COMMIT_PREPARED = 'K',
> > +   LOGICAL_REP_MSG_ROLLBACK_PREPARED = 'r',
> > LOGICAL_REP_MSG_STREAM_START = 'S',
> > LOGICAL_REP_MSG_STREAM_END = 'E',
> > LOGICAL_REP_MSG_STREAM_COMMIT = 'c',
> > -   LOGICAL_REP_MSG_STREAM_ABORT = 'A'
> > +   LOGICAL_REP_MSG_STREAM_ABORT = 'A',
> > +   LOGICAL_REP_MSG_STREAM_PREPARE = 'p'
> >  } LogicalRepMsgType;
> > As we start adding more and more features, we will have to start
> > adding more message types, using meaningful characters might become
> > difficult. Should we start using numeric instead for the new feature
> > getting added?
>
> This may or may not become a problem sometime in the future, but I
> think the feedback is not really specific to the current patch set so
> I am skipping it at this time.
>

+1.

-- 
With Regards,
Amit Kapila.




Re: [HACKERS] logical decoding of two-phase transactions

2021-03-08 Thread vignesh C
On Tue, Mar 9, 2021 at 9:14 AM Peter Smith  wrote:
>
> On Mon, Mar 8, 2021 at 4:58 PM vignesh C  wrote:
> >
> > LOGICAL_REP_MSG_TYPE = 'Y',
> > +   LOGICAL_REP_MSG_BEGIN_PREPARE = 'b',
> > +   LOGICAL_REP_MSG_PREPARE = 'P',
> > +   LOGICAL_REP_MSG_COMMIT_PREPARED = 'K',
> > +   LOGICAL_REP_MSG_ROLLBACK_PREPARED = 'r',
> > LOGICAL_REP_MSG_STREAM_START = 'S',
> > LOGICAL_REP_MSG_STREAM_END = 'E',
> > LOGICAL_REP_MSG_STREAM_COMMIT = 'c',
> > -   LOGICAL_REP_MSG_STREAM_ABORT = 'A'
> > +   LOGICAL_REP_MSG_STREAM_ABORT = 'A',
> > +   LOGICAL_REP_MSG_STREAM_PREPARE = 'p'
> >  } LogicalRepMsgType;
> > As we start adding more and more features, we will have to start
> > adding more message types, using meaningful characters might become
> > difficult. Should we start using numeric instead for the new feature
> > getting added?
>
> This may or may not become a problem sometime in the future, but I
> think the feedback is not really specific to the current patch set so
> I am skipping it at this time.
>
> If you want, maybe create it as a separate thread, Is it OK?

I was thinking of changing the newly added message types to something
like below:
> LOGICAL_REP_MSG_TYPE = 'Y',
> +   LOGICAL_REP_MSG_BEGIN_PREPARE = 1,
> +   LOGICAL_REP_MSG_PREPARE = 2,
> +   LOGICAL_REP_MSG_COMMIT_PREPARED = 3,
> +   LOGICAL_REP_MSG_ROLLBACK_PREPARED = 4,
> LOGICAL_REP_MSG_STREAM_START = 'S',
> LOGICAL_REP_MSG_STREAM_END = 'E',
> LOGICAL_REP_MSG_STREAM_COMMIT = 'c',
> -   LOGICAL_REP_MSG_STREAM_ABORT = 'A'
> +   LOGICAL_REP_MSG_STREAM_ABORT = 'A',
> +   LOGICAL_REP_MSG_STREAM_PREPARE = 5
>  } LogicalRepMsgType;

Changing these values at a later time may become difficult, as it can
break backward compatibility. But if you feel the existing values are
better, we can keep them as they are and think about it later when we add
more message types.

Regards,
Vignesh




Re: POC: converting Lists into arrays

2021-03-08 Thread bu...@sohu.com
Hello, here are some macros for list_make; now we can use
list_make(...) instead of list_make1/2/3 ...

#define MACRO_ARGS(...) __VA_ARGS__
#define LIST_MAKE_1_(narg_, postfix_, ...) list_make ## narg_ ## 
postfix_(__VA_ARGS__)
#define LIST_MAKE_2_(...) LIST_MAKE_1_(__VA_ARGS__)
#define LIST_MAKE_3_(...) LIST_MAKE_2_(__VA_ARGS__)

#define list_make(...) LIST_MAKE_3_(MACRO_ARGS VA_ARGS_NARGS(__VA_ARGS__), 
/*empty*/, __VA_ARGS__)
#define list_make_int(...) LIST_MAKE_3_(MACRO_ARGS VA_ARGS_NARGS(__VA_ARGS__), 
_int, __VA_ARGS__)
#define list_make_oid(...) LIST_MAKE_3_(MACRO_ARGS VA_ARGS_NARGS(__VA_ARGS__), 
_oid, __VA_ARGS__)

The macro VA_ARGS_NARGS is defined in c.h.

How it works:
for list_make_int(4,5,6)
step 1: LIST_MAKE_3_(MACRO_ARGS VA_ARGS_NARGS(4,5,6), _int, 4,5,6)
step 2: LIST_MAKE_2_(MACRO_ARGS (3), _int, 4,5,6)
step 3: LIST_MAKE_1_(3, _int, 4,5,6)
step 4: list_make3_int(4,5,6)
step 5: list_make3_impl(T_IntList, ((ListCell) {.int_value = (4)}), ((ListCell) 
{.int_value = (5)}), ((ListCell) {.int_value = (6)}))

Or we can define some public macros, like this:
#define MACRO_ARGS(...) __VA_ARGS__
#define MACRO_COMBIN_1(prefix_, center_, postfix_, ...) prefix_ ## center_ ## 
postfix_(__VA_ARGS__)
#define MACRO_COMBIN_2(...) MACRO_COMBIN_1(__VA_ARGS__)
#define MACRO_COMBIN_3(...) MACRO_COMBIN_2(__VA_ARGS__)

#define list_make(...) MACRO_COMBIN_3(list_make, MACRO_ARGS 
VA_ARGS_NARGS(__VA_ARGS__), /*empty*/, __VA_ARGS__)
#define list_make_int(...) MACRO_COMBIN_3(list_make, MACRO_ARGS 
VA_ARGS_NARGS(__VA_ARGS__), _int, __VA_ARGS__)
#define list_make_oid(...) MACRO_COMBIN_3(list_make, MACRO_ARGS 
VA_ARGS_NARGS(__VA_ARGS__), _oid, __VA_ARGS__)



bu...@sohu.com
 
From: Tom Lane
Date: 2019-02-24 10:24
To: pgsql-hackers
Subject: POC: converting Lists into arrays
For quite some years now there's been dissatisfaction with our List
data structure implementation.  Because it separately palloc's each
list cell, it chews up lots of memory, and it's none too cache-friendly
because the cells aren't necessarily adjacent.  Moreover, our typical
usage is to just build a list by repeated lappend's and never modify it,
so that the flexibility of having separately insertable/removable list
cells is usually wasted.
 
Every time this has come up, I've opined that the right fix is to jack
up the List API and drive a new implementation underneath, as we did
once before (cf commit d0b4399d81).  I thought maybe it was about time
to provide some evidence for that position, so attached is a POC patch
that changes Lists into expansible arrays, while preserving most of
their existing API.
 
The big-picture performance change is that this makes list_nth()
a cheap O(1) operation, while lappend() is still pretty cheap;
on the downside, lcons() becomes O(N), as does insertion or deletion
in the middle of a list.  But we don't use lcons() very much
(and maybe a lot of the existing usages aren't really necessary?),
while insertion/deletion in long lists is a vanishingly infrequent
operation.  Meanwhile, making list_nth() cheap is a *huge* win.
 
The most critical thing that we lose by doing this is that when a
List is modified, all of its cells may need to move, which breaks
a lot of code that assumes it can insert or delete a cell while
hanging onto a pointer to a nearby cell.  In almost every case,
this takes the form of doing list insertions or deletions inside
a foreach() loop, and the pointer that's invalidated is the loop's
current-cell pointer.  Fortunately, with list_nth() now being so cheap,
we can replace these uses of foreach() with loops using an integer
index variable and fetching the next list element directly with
list_nth().  Most of these places were loops around list_delete_cell
calls, which I replaced with a new function list_delete_nth_cell
to go along with the emphasis on the integer index.
 
I don't claim to have found every case where that could happen,
although I added debug support in list.c to force list contents
to move on every list modification, and this patch does pass
check-world with that support turned on.  I fear that some such
bugs remain, though.
 
There is one big way in which I failed to preserve the old API
syntactically: lnext() now requires a pointer to the List as
well as the current ListCell, so that it can figure out where
the end of the cell array is.  That requires touching something
like 150 places that otherwise wouldn't have had to be touched,
which is annoying, even though most of those changes are trivial.
 
I thought about avoiding that by requiring Lists to keep a "sentinel"
value in the cell after the end of the active array, so that lnext()
could look for the sentinel to detect list end.  However, that idea
doesn't really work, because if the list array has been moved, the
spot where the sentinel had been could have been reallocated and
filled with something else.  So this'd offer no defense against the
possibility of a stale ListCell pointer, which is something that
we absolutely need defenses for.  

Re: [HACKERS] logical decoding of two-phase transactions

2021-03-08 Thread Peter Smith
On Mon, Mar 8, 2021 at 4:58 PM vignesh C  wrote:
>
> LOGICAL_REP_MSG_TYPE = 'Y',
> +   LOGICAL_REP_MSG_BEGIN_PREPARE = 'b',
> +   LOGICAL_REP_MSG_PREPARE = 'P',
> +   LOGICAL_REP_MSG_COMMIT_PREPARED = 'K',
> +   LOGICAL_REP_MSG_ROLLBACK_PREPARED = 'r',
> LOGICAL_REP_MSG_STREAM_START = 'S',
> LOGICAL_REP_MSG_STREAM_END = 'E',
> LOGICAL_REP_MSG_STREAM_COMMIT = 'c',
> -   LOGICAL_REP_MSG_STREAM_ABORT = 'A'
> +   LOGICAL_REP_MSG_STREAM_ABORT = 'A',
> +   LOGICAL_REP_MSG_STREAM_PREPARE = 'p'
>  } LogicalRepMsgType;
> As we start adding more and more features, we will have to start
> adding more message types, using meaningful characters might become
> difficult. Should we start using numeric instead for the new feature
> getting added?

This may or may not become a problem sometime in the future, but I
think the feedback is not really specific to the current patch set so
I am skipping it at this time.

If you want, maybe create it as a separate thread. Is that OK?


Kind Regards,
Peter Smith.
Fujitsu Australia




Re: New IndexAM API controlling index vacuum strategies

2021-03-08 Thread Peter Geoghegan
On Mon, Mar 8, 2021 at 10:57 AM Robert Haas  wrote:
> Yes, I agree that it's good to postpone this to a future release, and
> that thinking through the consequences is not so easy.

The current plan is to commit something like Masahiko's
skip_index_vacuum.patch for Postgres 14. The latest version of that
patch (a reduced-scope version of Masahiko's patch without any changes
to MaxHeapTuplesPerPage) is available from:

https://postgr.es/m/CAD21AoAtZb4+HJT_8RoOXvu4HM-Zd4HKS3YSMCH6+-W=bdy...@mail.gmail.com

The idea is to "unify the vacuum_cleanup_index_scale_factor feature
from Postgres 11 with the INDEX_CLEANUP feature from Postgres 12".
This is the broader plan to make that "unification" happen for
Postgres 14:

https://postgr.es/m/CAH2-WzkYaDdbWOEwSSmC65FzF_jRLq-cxrYtt-2+ASoA156X=w...@mail.gmail.com

So, as I said, any change to MaxHeapTuplesPerPage is now out of scope
for Postgres 14.

> One possible
> consequence that I'm concerned about is sequential scan performance.
> For an index scan, you just jump to the line pointer you want and then
> go get the tuple, but a sequential scan has to loop over all the line
> pointers on the page, and skipping a lot of dead ones can't be
> completely free. A small increase in MaxHeapTuplesPerPage probably
> wouldn't matter, but the proposed increase of almost 10x (291 -> 2042)
> is a bit scary.

I agree. Maybe the real problem here is that MaxHeapTuplesPerPage is a
generic constant. Perhaps it should be something that can vary by
table, according to practical table-level considerations such as
projected tuple width given the "shape" of tuples for that table, etc.

Certain DB systems that use bitmap indexes extensively allow this to
be configured per-table. If you need to encode a bunch of TIDs as
bitmaps, you first need some trivial mapping from TIDs to integers
(before you even build the bitmap, much less compress it). So even
without VACUUM there is a trade-off to be made. It is *roughly*
comparable to the trade-off you make when deciding on a page size.

What I really want to do for Postgres 14 is to establish the principle
that index vacuuming is theoretically optional -- in all cases. There
will be immediate practical benefits, too. I think it's important to
remove the artificial behavioral differences between cases where there
are 0 dead tuples and cases where there is only 1. My guess is that
99%+ append-only tables are far more common than 100% append-only
tables in practice.

> It's also a little hard to believe that letting almost
> 50% of the total space on the page get chewed up by the line pointer
> array is going to be optimal. If that happens to every page while the
> amount of data stays the same, the table must almost double in size.
> That's got to be bad.

I think that we should be prepared for a large diversity of conditions
within a given table. It follows that we should try to be adaptive.

The reduced-scope patch currently tracks LP_DEAD line pointers at the
heap page level, and then applies a count of heap blocks with one or
more LP_DEAD line pointers (could be existing or just pruned by this
VACUUM) to determine a count of heap pages. This is used to determine
a threshold at which index vacuuming should be forced. Currently we
have a multiplier constant called SKIP_VACUUM_PAGES_RATIO, which is
0.01 -- 1% of heap blocks. Of course, it's possible that LP_DEAD line
pointers will be very concentrated, in which case we're more
aggressive about skipping index vacuuming (if you think of it in terms
of dead TIDs instead of heap blocks we're aggressive, that is). The
other extreme exists too: LP_DEAD line pointers may instead be spread
diffusively across all heap pages, in which case we are unlikely to
ever skip index vacuuming outside of cases like anti-wraparound vacuum
or insert-driven vacuum to set VM bits.

The next iteration of the high-level "granular vacuum" project (which
will presumably target Postgres 15) should probably involve more
complicated, qualitative judgements about LP_DEAD line pointers in the
heap. Likewise it should care about individual needs of indexes, which
is something that Masahiko experimented with in earlier drafts of the
patch on this thread. The needs of each index can be quite different
with bottom-up index deletion. We may in fact end up adding a new,
moderately complicated cost model -- it may have to be modelled as an
optimization problem.

In short, I think that thinking more about the logical state of the
database during VACUUM is likely to pay off ("heap blocks vs dead
tuples" is one part of that). VACUUM should be a little more
qualitative, and a little less quantitative. The fact that we
currently don't do stuff like that (unless bottom-up index deletion
counts) is not an inherent limitation of the design of VACUUM. I'm not
entirely sure how far it can be pushed, but it seems quite promising.

> The whole thing would be more appealing if there
> were some way to exert exponentially increasing back-pressure on 

Re: [PATCH] pg_permissions

2021-03-08 Thread Chapman Flack
On Sat, Mar 06, 2021 at 08:03:17PM +0100, Joel Jacobson wrote:
>regclass   |  obj_desc   | grantor | grantee |
privilege_type | is_grantable
>
--+-+-+-++--

1. Is there a reason not to make 'grantor' and 'grantee' of type regrole?
   In other words, s/rolname/oid::regrole/ throughout the view definition.
   It looks the same visually, but should be easier to build on in a larger
   query.

   Hmm, ok, a grantee of 'public' can't be expressed as a regrole. This
   seems an annoying little corner.[1] It can be represented by 0::regrole,
   but that displays as '-'. Hmm again, you can even '-'::regrole and get 0.

2. Also to facilitate use in a larger query, how about columns for the
   objid and objsubid, in addition to the human-friendly obj_desc?
   And I'm not sure about using pg_attribute as the regclass for
   attributes; it's nice to look at, but could also plant the wrong idea
   that attributes have pg_attribute as their classid, when it's really
   pg_class with an objsubid. Anyway, there's the human-friendly obj_desc
   to tell you it's a column.

On 03/08/21 12:14, Joel Jacobson wrote:
> On Mon, Mar 8, 2021, at 15:35, Joe Conway wrote:
>> While this is interesting and probably useful for troubleshooting, it does 
>> not
>> provide the complete picture if what you care about is something like "what
>> stuff can joel do in my database".
> 
> Good point, I agree.
> 
> I think that's a different more complicated use-case though.

I could agree that the role membership and inherit/noinherit part is
a more complicated problem that could be solved by a larger query built
over this view (facilitated by giving grantor and grantee regrole type)
and some recursive-CTEness with the roles.

But I think it would be useful for this view to handle the part of the story
that involves acldefault() when the stored aclitem[] is null. I've long
wanted a view that actually shows you all of the permissions that apply
to something, even the ones you're supposed to Just Know, and indeed
I wrote such a thing for $work.

Then you could even query the view for an answer to the question "what
are all the permissions 'public' (or '-') can exercise here?"
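
Concretely, I'm picturing something of roughly this shape -- an untested
sketch using aclexplode() and falling back to acldefault() when the stored
ACL is null, not the actual view definition from the patch:

SELECT c.oid::regclass      AS regclass,
       a.grantor::regrole   AS grantor,
       a.grantee::regrole   AS grantee,   -- 0, i.e. '-', means PUBLIC
       a.privilege_type,
       a.is_grantable
FROM pg_class c,
     aclexplode(coalesce(c.relacl, acldefault('r', c.relowner))) AS a
WHERE c.relkind IN ('r', 'p', 'v', 'm');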

On 03/06/21 19:08, Joel Jacobson wrote:
> SELECT * FROM ownerships WHERE rolname = 'joel' LIMIT 5;
>  regclass | obj_desc  | rolname
> --+---+-

Here again, I'd repeat the suggestions to present the owner as a regrole
(and in this case there is no need to deal with 'public'), and to include
the objid as well as the human-friendly obj_desc.
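
In other words, something along these lines for the table case (again
only an untested sketch of the suggested shape):

SELECT c.oid::regclass      AS regclass,
       c.oid                AS objid,
       pg_describe_object('pg_class'::regclass, c.oid, 0) AS obj_desc,
       c.relowner::regrole  AS owner
FROM pg_class c;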

Regards,
-Chap




Re: TRUNCATE on foreign table

2021-03-08 Thread Amit Langote
On Tue, Mar 9, 2021 at 2:24 AM Ibrar Ahmed  wrote:
> The patch (pgsql14-truncate-on-foreign-table.v2.patch) does not apply 
> successfully.
>
> http://cfbot.cputube.org/patch_32_2972.log
>
> patching file contrib/postgres_fdw/expected/postgres_fdw.out
> Hunk #2 FAILED at 9179.
> 1 out of 2 hunks FAILED -- saving rejects to file 
> contrib/postgres_fdw/expected/postgres_fdw.out.rej
>
> As this is a minor change therefore I have updated the patch. Please take a 
> look.

Thanks for updating the patch.  I was able to apply it successfully
though I notice it doesn't pass make check-world.

Specifically, it fails the src/test/subscription/013_partition.pl
test.  The problem seems to be that worker.c: apply_handle_truncate()
hasn't been updated to add entries to relids_extra for partitions
expanded from a partitioned table, like ExecuteTruncate() does.  That
leads to relids and relids_extra having different lengths, which trips
the Assert in ExecuteTruncateGuts().

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: pg_stat_statements oddity with track = all

2021-03-08 Thread Julien Rouhaud
On Mon, Mar 08, 2021 at 06:03:59PM +0100, Magnus Hagander wrote:
> On Sun, Mar 7, 2021 at 8:39 AM Julien Rouhaud  wrote:
> >
> > Yes I was a bit worried about the possible extra text entry.  I kept things
> > simple until now as the general opinion was that entries existing for both 
> > top
> > level and nested level should be very rare so adding extra code and 
> > overhead to
> > spare a few query texts wouldn't be worth it.
> >
> > I think that using a flag might be a bit expensive, as we would have to make
> > sure that at least one of the possible two entries has it.  So if there are 
> > 2
> > entries and the one with the flag is evicted, we would have to transfer the
> > flag to the other one, and check the existence of the flag when allocating a 
> > new
> > entry.  And all of that has to be done holding an exclusive lock on 
> > pgss->lock.
> 
> Yeah, we'd certainly want to minimize things. But what if they both
> have the flag at that point? Then we wouldn't have to care on
> eviction? But yes, for new allocations we'd have to look up if the
> query existed with the other value of the flag, and copy it over in
> that case.

I think that we might be able to handle that without a flag.  The only thing
that would need to be done is, when creating an entry, to look for an existing
entry with the opposite flag, and if there is one simply use the same
(query_offset, query_len) info.  This doesn't sound that expensive.
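
For concreteness, the duplicated-text case we're talking about is easy to
produce (this assumes pg_stat_statements.track = 'all'; the toplevel column
name is the one added by this patch series):

CREATE FUNCTION count_pg_class() RETURNS bigint LANGUAGE plpgsql AS $$
DECLARE n bigint;
BEGIN
  SELECT count(*) INTO n FROM pg_class;
  RETURN n;
END;
$$;

SELECT count(*) FROM pg_class;   -- recorded with toplevel = true
SELECT count_pg_class();         -- records a second, nested-level entry
                                 -- for the same statement

SELECT queryid, toplevel, calls
FROM pg_stat_statements
WHERE query LIKE '%count(*) FROM pg_class%';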

The real pain point will be that the garbage collection phase
will become way more expensive as it will now have to somehow maintain that
knowledge, which will require additional lookups for each entry.  I'm a bit
concerned about that, especially with the current heuristic to schedule garbage
collection.  For now, need_gc_qtexts() says that we have to do it if the extent is
more than 512 (B) * pgss_max.  This probably doesn't work well for people using
ORM as they tend to generate gigantic SQL queries.

If we implement query text deduplication, should we add another GUC for that
"512" magic value so that people can minimize the gc overhead if they know they
have gigantic queries, or simply don't mind bigger qtext file?

> > Maybe having a new hash table (without the toplevel flag) for those query 
> > text
> > might be better, or maybe pgss performance is already so terrible when you 
> > have
> > to regularly evict entries that it wouldn't make any real difference.
> >
> > Should I try to add some extra code to make sure that we only store the 
> > query
> > text once, or should I document that there might be duplicate, but we expect
> > that to be very rare?
> 
> If we expect it to be rare, I think it might be reasonable to just
> document that. But do we really have a strong argument for it being
> rare?

I don't think that anyone really had a strong argument, mostly gut
feeling.  Note that pg_stat_kcache already implemented that toplevel flag, so
if people are using that extension in a recent version they might have some
figures to show.  I'll ping some people that I know are using it.

One good argument would be that gigantic queries generated by ORM should always
be executed as top level statements.

I previously tried with the postgres regression tests, which clearly aren't a
representative workload, and as far as I can see the vast majority of queries
executed both as top level and nested level are DDL implying recursion (e.g. a
CREATE TABLE with underlying index creation).




Re: Disallow SSL compression?

2021-03-08 Thread Michael Paquier
On Sat, Mar 06, 2021 at 10:39:52AM +0900, Michael Paquier wrote:
> Okay, cool.  I'd rather wait more for Peter before doing anything, so
> if there are no objections, I'll look at that stuff again at the
> beginning of next week and perhaps apply it.  If you wish to take care
> of that yourself, please feel free to do so, of course.

So, I have looked at the proposed patch in details, fixed the
documentation of pg_stat_ssl where compression was still listed,
checked a couple of things with and without OpenSSL, across past major
PG versions with OpenSSL 1.0.2 to see if compression was getting
disabled correctly.  And things look all good, so applied.
--
Michael


signature.asc
Description: PGP signature


Re: 64-bit XIDs in deleted nbtree pages

2021-03-08 Thread Peter Geoghegan
On Sun, Mar 7, 2021 at 8:52 PM Masahiko Sawada  wrote:
> Yeah, I think that's right.
>
> Perhaps we can do something so that autovacuums triggered by
> autovacuum_vacuum_insert_scale_factor are triggered on only a true
> insert-only case (e.g., by checking if n_dead_tup is 0).

Right -- that's really what it would mean to "remove
vacuum_cleanup_index_scale_factor in the backbranches".

I now think that it won't even be necessary to make many changes
within VACUUM ANALYZE to avoid unwanted side-effects from removing
vacuum_cleanup_index_scale_factor, per my mail to Tom today:

https://postgr.es/m/cah2-wzknxdcomjhqo4suxvfk_q1171gjo2zghz1y6pion6u...@mail.gmail.com

I'm starting to lean towards "removing
vacuum_cleanup_index_scale_factor" in Postgres 13 and master only,
purely to fix the two issues in Postgres 13 (the insert-driven vacuum
issue and the deduplication stats issue I go into in the mail I link
to). A much more conservative approach should be used to fix the more
superficial issue -- the issue of getting an accurate value (for
pg_class.reltuples) from "info->num_heap_tuples". As discussed
already, the conservative fix is to delay reading
"info->num_heap_tuples" until btvacuumcleanup(), even in cases where
there are btbulkdelete() calls for the VACUUM.

Then we can then revisit your patch to make vacuumlazy.c skip index
vacuuming when there are very few dead tuples, but more than 0 dead
tuples [1]. I should be able to commit that for Postgres 14.

(I will probably finish off my other patch to make nbtree VACUUM
recycle pages deleted during the same VACUUM operation last of all.)

[1] 
https://postgr.es/m/CAD21AoAtZb4+HJT_8RoOXvu4HM-Zd4HKS3YSMCH6+-W=bdy...@mail.gmail.com
--
Peter Geoghegan




Re: cleanup temporary files after crash

2021-03-08 Thread Tomas Vondra
Hi,

Let's move this patch forward. Based on the responses, I agree the
default behavior should be to remove the temp files, and I think we
should have the GUC (on the off chance that someone wants to preserve
the temporary files for debugging or whatever other reason).

I propose to rename the GUC to remove_temp_files_after_crash, I think
"remove" is a bit clearer than "cleanup". I've also reworded the sgml
docs a little bit.

Attached is a patch with those changes. Barring objections, I'll get
this committed in the next couple days.
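
And since the GUC is SIGHUP, anyone who does want the old
keep-the-files-for-debugging behavior can turn it off without a restart,
e.g. (GUC name per the attached patch):

ALTER SYSTEM SET remove_temp_files_after_crash = off;
SELECT pg_reload_conf();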


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
>From 70cc3bf303585f1ffcecbace43d13d3d4f545f1b Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Mon, 25 May 2020 00:08:20 -0300
Subject: [PATCH] Control temporary files removal after crash

A new GUC remove_temp_files_after_crash controls whether temporary
files are removed after a crash. Successive crashes could otherwise
result in useless storage usage until the service is restarted, which
can be a problem on hosts with inadequate resources, and requiring that
kind of manual intervention is not desirable in some environments. The
GUC is marked as SIGHUP, hence you don't have to interrupt the service
to change it.

This introduces a minor backward incompatibility: Postgres will now
reclaim temporary file space after a crash by default. The main reason
is that we favor service continuity over debugging.
---
 doc/src/sgml/config.sgml  |  18 ++
 src/backend/postmaster/postmaster.c   |   5 +
 src/backend/storage/file/fd.c |  12 +-
 src/backend/utils/misc/guc.c  |   9 +
 src/backend/utils/misc/postgresql.conf.sample |   2 +
 src/include/postmaster/postmaster.h   |   1 +
 src/test/recovery/t/022_crash_temp_files.pl   | 194 ++
 7 files changed, 236 insertions(+), 5 deletions(-)
 create mode 100644 src/test/recovery/t/022_crash_temp_files.pl

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 967de73596..1b102960d6 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -9665,6 +9665,24 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
   
  
 
+ 
+  remove_temp_files_after_crash (boolean)
+  
+   remove_temp_files_after_crash configuration parameter
+  
+  
+  
+   
+When set to on, PostgreSQL will automatically
+remove temporary files after a backend crash. When disabled, the temporary
+files are retained after a crash, which may be useful in some circumstances
+(e.g. during debugging). It may however result in accumulation of many
+useless files, or possibly even running out of disk space.
+It defaults to on.
+   
+  
+ 
+
  
   data_sync_retry (boolean)
   
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index edab95a19e..2338892c33 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -242,6 +242,7 @@ bool		Db_user_namespace = false;
 bool		enable_bonjour = false;
 char	   *bonjour_name;
 bool		restart_after_crash = true;
+bool		remove_temp_files_after_crash = true;
 
 /* PIDs of special child processes; 0 when not running */
 static pid_t StartupPID = 0,
@@ -3975,6 +3976,10 @@ PostmasterStateMachine(void)
 		ereport(LOG,
 (errmsg("all server processes terminated; reinitializing")));
 
+		/* remove leftover temporary files after a crash */
+		if (remove_temp_files_after_crash)
+			RemovePgTempFiles();
+
 		/* allow background workers to immediately restart */
 		ResetBackgroundWorkerCrashTimes();
 
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index b58502837a..dfc3c2fe3e 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -3024,11 +3024,13 @@ CleanupTempFiles(bool isCommit, bool isProcExit)
  * remove any leftover files created by OpenTemporaryFile and any leftover
  * temporary relation files created by mdcreate.
  *
- * NOTE: we could, but don't, call this during a post-backend-crash restart
- * cycle.  The argument for not doing it is that someone might want to examine
- * the temp files for debugging purposes.  This does however mean that
- * OpenTemporaryFile had better allow for collision with an existing temp
- * file name.
+ * During post-backend-crash restart cycle, this routine could be called if
+ * remove_temp_files_after_crash GUC is enabled. Multiple crashes while
+ * queries are using temp files could result in useless storage usage that can
+ * only be reclaimed by a service restart. The argument against enabling it is
+ * that someone might want to examine the temp files for debugging purposes.
+ * This does however mean that OpenTemporaryFile had better allow for
+ * collision with an existing temp file name.
  *
  * NOTE: this function and its subroutines generally report syscall 

Re: Boundary value check in lazy_tid_reaped()

2021-03-08 Thread Masahiko Sawada
On Mon, Mar 8, 2021 at 7:16 PM Peter Eisentraut
 wrote:
>
> On 21.01.21 14:11, Masahiko Sawada wrote:
> > Agreed. bsearch with bound check showed a reasonable improvement in my
> > evaluation in terms of performance. Regarding memory efficiency, we
> > can experiment with other methods later.
> >
> > I've attached the patch that adds a bound check for encoded
> > itempointers before bsearch() in lazy_tid_reaped() and inlines the
> > function.
>
> Do you have any data showing the effect of inlining lazy_tid_reaped()?
> I mean, it probably won't hurt, but it wasn't part of the original patch
> that you tested, so I wonder whether it has any noticeable effect.

I've done some benchmarks while changing the distribution of where
dead tuples exist within the table. The table size is 4GB and 20% of
total tuples are dirty. Here are the results of index vacuum execution
time:

1. Updated the table evenly (every block has at least one dead tuple).
master  : 8.15
inlining  : 4.84
not-inlining  : 5.01

2. Updated the middle of the table.
master  : 8.71
inlining  : 3.51
not-inlining  : 3.58

3. Updated both the beginning and the tail of the table.
master  : 8.44
inlining  : 3.46
not-inlining  : 3.50

There is no noticeable effect of inlining lazy_tid_reaped(). So it
would be better to not do that.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Proposal: Save user's original authenticated identity for logging

2021-03-08 Thread Jacob Champion
On Sat, 2021-03-06 at 18:33 +0100, Magnus Hagander wrote:
> In fact, if we're storing it in the Port, why
> are we even passing it as a separate parameter to check_usermap --
> shouldn't that one always use this same value?

Ah, and now I remember why I didn't consolidate this to begin with.
Several auth methods perform some sort of translation before checking
the usermap: cert pulls the CN out of the Subject DN, SSPI and GSS can
optionally strip the realm, etc.

> ISTM that it could be
> quite confusing if the logged value is different from whatever we
> apply to the user mapping?

Maybe. But it's an accurate reflection of what's actually happening,
and that's the goal of the patch: show enough information to be able to
audit who's logging in. The certificates

/OU=ACME Ltd./C=US/CN=pchampion

and

/OU=Postgres/C=GR/CN=pchampion

are different identities, but Postgres will silently authorize them to
log in as the same user. In my opinion, hiding that information makes
things more confusing in the long term, not less.

--Jacob


Re: Let people set host(no)ssl settings from initdb

2021-03-08 Thread Michael Paquier
On Mon, Mar 08, 2021 at 06:13:14PM -0500, Andrew Dunstan wrote:
> What is the point of doing that if we're going to reject the patch as
> discussed upthread?

I have read again this thread, and still understand that this is the
consensus that has been reached.  The CF entry has been updated to
reflect that.
--
Michael


signature.asc
Description: PGP signature


RE: Implementing Incremental View Maintenance

2021-03-08 Thread tsunakawa.ta...@fujitsu.com
From: Thomas Munro 
> It's probably time to move forward with the plan of pushing the
> results into a commitfest.postgresql.org API, and then making Magnus
> et al write the email spam code with a preferences screen linked to
> your community account :-D

+1
I wish to see all the patch status information on the CF app.


Regards
Takayuki Tsunakawa



Re: Implementing Incremental View Maintenance

2021-03-08 Thread Thomas Munro
On Tue, Mar 9, 2021 at 1:22 PM Yugo NAGATA  wrote:
> On Mon, 8 Mar 2021 15:42:00 -0500
> Andrew Dunstan  wrote:
> > (A useful feature of the cfbot might be to notify the authors and
> > reviewers when it detects bitrot for a previously passing entry.)
>
> +1
> The feature of notifying the authors seems nice to me.

Nice idea.  I was initially afraid of teaching cfbot to send email,
for fear of creating an out of control spam machine.  Probably the
main thing would be the ability to interact with it to turn it on/off.
It's probably time to move forward with the plan of pushing the
results into a commitfest.postgresql.org API, and then making Magnus
et al write the email spam code with a preferences screen linked to
your community account :-D




Re: Implementing Incremental View Maintenance

2021-03-08 Thread Yugo NAGATA
On Mon, 8 Mar 2021 15:42:00 -0500
Andrew Dunstan  wrote:

> 
> On 2/18/21 9:01 PM, Yugo NAGATA wrote:
> > On Thu, 18 Feb 2021 19:38:44 +0800
> > Andy Fan  wrote:
> >
> >> On Tue, Feb 16, 2021 at 9:33 AM Yugo NAGATA  wrote:
> >>
> >>> Hi,
> >>>
> >>> Attached is a rebased patch (v22a).
> >>>
> >> Thanks for the patch. Will you think posting a patch with the latest commit
> >> at that
> >> time is helpful? If so, when others want to review it,  they know which
> >> commit to
> >> apply the patch without asking for a new rebase usually.
> > I rebased the patch because cfbot failed.
> > http://cfbot.cputube.org/
> >
> 
> It's bitrotted a bit more dues to commits bb437f995d and 25936fd46c

Thank you for letting me know.  I'll rebase it soon.

> 
> 
> (A useful feature of the cfbot might be to notify the authors and
> reviewers when it detects bitrot for a previously passing entry.)

+1
The feature of notifying the authors seems nice to me.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: Optimising latch signals

2021-03-08 Thread Thomas Munro
On Tue, Mar 9, 2021 at 12:20 PM Alvaro Herrera  wrote:
> On 2021-Mar-03, Thomas Munro wrote:
> > On Mon, Mar 1, 2021 at 2:29 PM Thomas Munro  wrote:
> > > Time to watch the buildfarm to find out if my speculation about
> > > illumos is correct...
> >
> > I just heard that damselfly's host has been decommissioned with no
> > immediate plan for a replacement.  That was the last of the
> > Solaris-family animals testing master.  It may be some time before I
> > find out if my assumptions broke something on that OS...
>
> Hi, I don't know if you realized but we have two new Illumos members
> now (haddock and hake), and they're both failing initdb on signalfd()
> problems.

Ah, cool.  I'd been discussing this with their owner, who saw my
message and wanted to provide replacements.  Nice to see these start
up even though I don't love the colour of their first results.  In
off-list emails, we got as far as determining that signalfd() fails on
illumos when running inside a zone (= container), because
/dev/signalfd is somehow not present.  Apparently it works when
running on the main host.  Tracing revealed that it's trying to open
that device and getting ENOENT here:

running bootstrap script ... FATAL:  XX000: signalfd() failed
LOCATION:  InitializeLatchSupport, latch.c:279

I'll wait a short time while he tries to see if that can be fixed (I
have no clue if it's a configuration problem in some kind of zone
creation scripts, or a bug, or what).  If not, my fallback plan will
be to change it to default to WAIT_USE_POLL on that OS until it can be
fixed.




Re: Proposal: Save user's original authenticated identity for logging

2021-03-08 Thread Jacob Champion
On Mon, 2021-03-08 at 22:16 +, Jacob Champion wrote:
> On Sat, 2021-03-06 at 18:33 +0100, Magnus Hagander wrote:
> > With this we store the same value as the authn and as
> > port->gss->princ, and AFAICT it's only used once. Seems we could just
> > use the new field for the gssapi usage as well? Especially since that
> > usage only seems to be there in order to do the gssapi specific
> > logging of, well, the same thing.
> > 
> > [...]
> 
> Seems reasonable; I'll consolidate them.

A slight hitch in the plan, for the GSS side... port->gss->princ is
exposed by pg_stat_gssapi. I can switch this to use port->authn_id
easily enough.

But it seems like the existence of a user principal for the connection
is independent of whether or not you're using that principal as your
identity. For example, you might connect via a "hostgssenc ... trust"
line in the HBA. (This would be analogous to presenting a user
certificate over TLS but not using it to authenticate to the database.)
I'd argue that the principal should be available through the stats view
in this case as well, just like you can see a client DN in pg_stat_ssl
even if you're using trust auth.
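
For comparison, this is what the SSL side already gives you -- the client
DN is visible for a "hostssl ... trust" connection even though it wasn't
used for authentication (untested sketch, joining the two views on pid):

SELECT s.pid, s.client_dn, g.gss_authenticated, g.principal
FROM pg_stat_ssl s
FULL JOIN pg_stat_gssapi g USING (pid);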

The server doesn't currently support that -- gss->princ is only
populated in the gss auth case, as far as I can tell -- but if I remove
gss->princ entirely, then it'll be that much more work for someone who
wants to expose that info later. I think it should remain independent.

Thoughts?

--Jacob


Re: documentation fix for SET ROLE

2021-03-08 Thread David G. Johnston
On Mon, Mar 8, 2021 at 4:41 PM David G. Johnston 
wrote:

> On Thu, Feb 18, 2021 at 6:18 PM Bossart, Nathan 
> wrote:
>
>> On 2/17/21 2:12 PM, David G. Johnston wrote:
>> > On Wednesday, February 17, 2021, Bossart, Nathan > > > wrote:
>> >
>> >
>> > postgres=# ALTER ROLE test1 SET ROLE test2;
>> > ALTER ROLE
>> >
>> >
>> > I would not have expected this to work - “role” isn’t a
>> > configuration_parameter.  Its actually cool that it does, but this doc
>> fix
>> > should address this oversight as well.
>>
>> Here's a patch that adds "role" and "session authorization" as
>> configuration parameters, too.
>>
>>
> You will want to add this to the commitfest if you haven't already.
>
> I would suggest adding a section titled "Identification" and placing these
> under that.
>
> Reading it over it looks good.  One point though: SET and SET ROLE are
> indeed "at run-time" (not 'run time').  ALTER ROLE and ALTER DATABASE
> should be considered "at connection-time" just like the command-line
> options.
>
>
Also, as a nearby email just reminded me, the determination of which role
name is used to figure out default settings is the presented user name, not
the one that would result from a connection-time role change as described
here - though this should be tested, and then documented.
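
A sketch of that test, roughly (untested; the role names and the work_mem
setting are just placeholders):

CREATE ROLE test1 LOGIN;
CREATE ROLE test2;
GRANT test2 TO test1;
ALTER ROLE test1 SET ROLE test2;          -- connection-time role change
ALTER ROLE test2 SET work_mem = '123MB';  -- per-role default on the target role
-- then, in a fresh connection as test1:
--   SELECT current_user;   -- presumably reports test2
--   SHOW work_mem;         -- does the test2 per-role default apply here, or not?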

David J.


Re: documentation fix for SET ROLE

2021-03-08 Thread David G. Johnston
On Thu, Feb 18, 2021 at 6:18 PM Bossart, Nathan  wrote:

> On 2/17/21 2:12 PM, David G. Johnston wrote:
> > On Wednesday, February 17, 2021, Bossart, Nathan  > > wrote:
> >
> >
> > postgres=# ALTER ROLE test1 SET ROLE test2;
> > ALTER ROLE
> >
> >
> > I would not have expected this to work - “role” isn’t a
> > configuration_parameter.  Its actually cool that it does, but this doc
> fix
> > should address this oversight as well.
>
> Here's a patch that adds "role" and "session authorization" as
> configuration parameters, too.
>
>
You will want to add this to the commitfest if you haven't already.

I would suggest adding a section titled "Identification" and placing these
under that.

Reading it over it looks good.  One point though: SET and SET ROLE are
indeed "at run-time" (not 'run time').  ALTER ROLE and ALTER DATABASE
should be considered "at connection-time" just like the command-line
options.

David J.


Re: cursor sensitivity misunderstanding

2021-03-08 Thread David G. Johnston
On Thu, Feb 25, 2021 at 8:37 AM Peter Eisentraut <
peter.eisentr...@enterprisedb.com> wrote:

> On 18.02.21 19:14, Peter Eisentraut wrote:
> > On 18.02.21 17:11, David G. Johnston wrote:
> >> The OP was doing a course based on Oracle and was confused regarding
> >> our behavior.  The documentation failed to help me provide a useful
> >> response, so I'd agree there is something here that needs reworking if
> >> not outright fixing.
> >
> > According to the piece of the standard that I posted, the sensitivity
> > behavior here is implementation-dependent (not even -defined), so both
> > implementations are correct.
> >
> > But the poster was apparently also confused by the same piece of
> > documentation.
>
> I came up with the attached patch to sort this out a bit.  It does not
> change any cursor behavior.  But the documentation now uses the terms
> more correctly and explains the differences between SQL and the
> PostgreSQL implementation better, I think.
>

Thanks! Though this seems like the wrong approach.  Simply noting that our
cursors are not standard compliant (or at least that we don't implement a
standard-compliant sensitive cursor) should suffice.  I don't really get
the point of adding ASENSITIVE if we don't have SENSITIVE too.  I'm also
too unfamiliar with the standard default behaviors to comment on where we
differ there - but that should be easy enough to address.

I would suggest limiting the doc change to pointing out that we do allow
for a standard-compliant INSENSITIVE-behaving cursor - one that precludes
local sensitivity via the FOR SHARE and FOR UPDATE clauses - by adding that
keyword.  Otherwise, while the cursor is still (and always) insensitive
globally, the cursor can become locally sensitive implicitly by including a
FOR UPDATE or FOR SHARE clause in the query.  Then maybe consider improving
the notes section through subtraction once a clearer initial
presentation has been made to the reader.

David J.


Re: Optimising latch signals

2021-03-08 Thread Alvaro Herrera
On 2021-Mar-03, Thomas Munro wrote:

> On Mon, Mar 1, 2021 at 2:29 PM Thomas Munro  wrote:
> > Time to watch the buildfarm to find out if my speculation about
> > illumos is correct...
> 
> I just heard that damselfly's host has been decommissioned with no
> immediate plan for a replacement.  That was the last of the
> Solaris-family animals testing master.  It may be some time before I
> find out if my assumptions broke something on that OS...

Hi, I don't know if you realized but we have two new Illumos members
now (haddock and hake), and they're both failing initdb on signalfd()
problems.


-- 
Álvaro Herrera   Valdivia, Chile




Re: non-HOT update not looking at FSM for large tuple update

2021-03-08 Thread Matthias van de Meent
On Mon, 8 Mar 2021 at 16:25, Floris Van Nee  wrote:
>
> > I've added this to the commitfest as a bug fix and added you as an author.
>
> Thanks. Patch looks good to me, but I guess there needs to be someone else 
> reviewing too?
> Also, would this be a backpatchable bugfix?
>
> -Floris
>

This patch fails to consider that len may be bigger than
MaxHeapTupleSize * 0.98, which in this case triggers a reproducible
PANIC:

=# CREATE TABLE t_failure (a int, b text) WITH (fillfactor = 10); --
force the new FSM calculation for large tuples
CREATE TABLE
=# ALTER TABLE t_failure ALTER COLUMN b SET STORAGE plain;
ALTER TABLE
=# INSERT INTO t_failure (SELECT FROM generate_series(1, 32)); -- use
up 32 line pointers on the first page.
INSERT 0 32
=# DELETE FROM t_failure;
DELETE 32
=# VACUUM (TRUNCATE OFF) t_failure; -- we now have a page that has
MaxHeapTupleSize > free space > 98% MaxHeapTupleSize
VACUUM
=# INSERT INTO t_failure (select 1, string_agg('1', '') from
generate_series(1, 8126));
PANIC:  failed to add tuple to page
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

A possible solution is to always request at least the size of the
tuple being inserted, e.g.:
- targetFreeSpace = MaxHeapTupleSize - (MaxHeapTupleSize * 2 / 100);
+ targetFreeSpace = Max(len, MaxHeapTupleSize - (MaxHeapTupleSize * 2 / 100));


One different question I have, though, is why we can't "just" teach
vacuum to clean up trailing unused line pointers. As in, can't we trim
the line pointer array when vacuum detects that the trailing line
pointers on the page are all unused?

The only documentation that I could find that this doesn't happen is
in the comment on PageIndexTupleDelete and PageRepairFragmentation,
both not very descriptive on why we can't shrink the page->pd_linp
array. One is "Unlike heap pages, we compact out the line pointer for
the removed tuple." (Jan. 2002), and the other is "It doesn't remove
unused line pointers! Please don't change this." (Oct. 2000), but I
can't seem to find the documentation / conversations on the
implications that such shrinking would have.

With regards,

Matthias van de Meent.




Re: Let people set host(no)ssl settings from initdb

2021-03-08 Thread Andrew Dunstan


On 3/8/21 11:23 AM, Ibrar Ahmed wrote:
>
>
> On Thu, Mar 4, 2021 at 7:25 AM Michael Paquier  > wrote:
>
> On Wed, Mar 03, 2021 at 03:07:30PM +0100, Peter Eisentraut wrote:
> > I think there is enough sustained opposition to this patch that
> we can mark
> > this as rejected in the commitfest.
>
> +1.
> --
> Michael
>
>
> The patch
> (v5-0001-Enable-setting-pg_hba.conf-permissions-from-initd.patch) does
> not apply successfully.
> There are two reasons first is it was not generated with proper "-p"
> which confused cfbot. Second, after
> fixing that issue you still need to rebase that.
>
>
> http://cfbot.cputube.org/patch_32_2916.log
> 
>
> |+++ doc/src/sgml/ref/initdb.sgml
> --
> No file to patch.  Skipping patch.
> 1 out of 1 hunk ignored
> can't find file to patch at input line 77
> Perhaps you used the wrong -p or --strip option?
> The text leading up to this was:
>
>
> Can we get a rebase?
>
> I am marking the patch "Waiting on Author"



What is the point of doing that if we're going to reject the patch as
discussed upthread?


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: PATCH: Batch/pipelining support for libpq

2021-03-08 Thread 'Alvaro Herrera'
On 2021-Mar-03, 'Alvaro Herrera' wrote:

> I wonder if it would make sense to get rid of conn->last_query
> completely and just rely always on conn->cmd_queue_head, where the
> normal (non-pipeline) would use the first entry of the command queue.
> That might end up being simpler than pipeline mode "pretending" to take
> over ->last_query.

I experimented with this today and it appears to be a good change,
though I haven't been able to make everything work correctly yet.

-- 
Álvaro Herrera   Valdivia, Chile
"Entristecido, Wutra (canción de Las Barreras)
echa a Freyr a rodar
y a nosotros al mar"




Re: proposal - operators ? and ->> for type record, and functions record_keys and record_each_text

2021-03-08 Thread Pavel Stehule
po 8. 3. 2021 v 23:12 odesílatel Tom Lane  napsal:

> Pavel Stehule  writes:
> > I found an interesting idea to have some basic functions and operators
> for
> > record type (similar to json, jsonb or hstore).
>
> I think this is a pretty bad idea, because there's no way to know what
> data type the result of -> should be.  "Smash it all to text" is a hack,
> not a solution --- and if you find that hack satisfactory, you might as
> well be using json or hstore.
>

I wrote (and sent) an implementation of a generic type that can hold any
type in binary form, and that can reduce I/O casts. It can be more efficient
than text, but the usability is the same as with json or text, because you
have to use explicit casts everywhere. I think other solutions are not
possible, because you don't know the real type before evaluation.


>
> Most of the other things you mention are predicated on the assumption
> that the field set will vary from one value to the next, which again
> seems more like something you'd do with json or hstore than with SQL
> composites.
>

I am thinking about effectiveness in triggers. The NEW and OLD variables are
of record type, and sometimes you need to do an operation just on the
tupledesc. When I work with a record type, I can do that without any
overhead. When I have to use jsonb or hstore instead, I pay a cost, because
all fields have to be transformed.

At a minimum, the operator "?" can be useful. It allows testing for
statically specified fields without risk of an exception. So I can write a
universal trigger with

IF NEW ? 'fieldx' THEN
   RAISE NOTICE '%', NEW.fieldx ;

and this operation can be fast and safe
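
For example (purely hypothetical, since the ? and ->> operators on record
are only proposed here), one generic trigger function could be attached to
tables that may or may not have the column:

CREATE OR REPLACE FUNCTION log_fieldx() RETURNS trigger AS $$
BEGIN
  IF NEW ? 'fieldx' THEN
    RAISE NOTICE 'fieldx = %', NEW->>'fieldx';
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;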



> regards, tom lane
>


Re: PATCH: Batch/pipelining support for libpq

2021-03-08 Thread David G. Johnston
On Wed, Mar 3, 2021 at 5:45 PM Justin Pryzby  wrote:

> I'm proposing some minor changes.
>
>
Some additional tweaks/comments for the documentation with the edit
proposed edits:

(edit) + PQresultStatus, will report a

Remove the comma

(orig) + the failed operation are skipped entirely. The same behaviour
holds if the

We seem to use "behavior" more frequently

(edit) + From the client perspective, after
PQresultStatus

Possessive "client's perspective"

(edit) + its expected results queue.  Based on available memory,
results from the

"its corresponding results queue" - to go along with this change:
- of the order in which it sent queries and the expected results.
+ of the order in which it sent queries, to associate them with their
+ corresponding results.

(orig)
+   pipeline mode. If the current statement isn't finished processing
+   or there are results pending for collection with

Needs a comma after processing.
"results pending for collection" reads oddly to me...not sure what would be
better though...

(edit)
+   
+
+ The pipeline API was introduced in
PostgreSQL 14.
+ Pipeline mode is a client-side feature which doesn't require server
+ support, and works on any server that supports the v3 extended query
+ protocol.
+ 
+   

This note seems like it should be placed either near the very beginning of
the feature or incorporated into the feature introduction.
(orig)
+ If any statement encounters an error, the server aborts the current
+(-) transaction and skips processing commands in the pipeline until the
+ next synchronization point established by
PQsendPipeline.

I dislike "skips" here - even if it doesn't execute a command it still will
place a result on the socket so that the client can have something to match
up with the queries it sent, correct?

+ transaction and creates a PGRES_PIPELINE_ABORTED result for all commands
in the pipeline until the

David J.


Re: Removing vacuum_cleanup_index_scale_factor

2021-03-08 Thread Peter Geoghegan
On Mon, Mar 8, 2021 at 1:38 PM Tom Lane  wrote:
> As you say, the history here is a bit convoluted, but it seems like
> a good principle to avoid interconnections between VACUUM and ANALYZE
> as much as we can.  I haven't been paying enough attention to this
> thread to have more insight than that.

The attached patch does what I proposed earlier today: it teaches
do_analyze_rel() to always set pg_class.reltuples for indexes when it
would do the same thing for the heap/table relation already. It's now
uniform in that sense.

Also included is a patch that removes the
vacuum_cleanup_index_scale_factor mechanism for triggering an index
scan during VACUUM -- that's what the second patch does (this depends
on the first patch, really).

Do you think that a backpatch to Postgres 13 for both of these patches
would be acceptable? There are two main concerns that I have in mind
here, both of which are only issues in Postgres 13:

1. Arguably the question of skipping scanning the index should have been
considered by the autovacuum_vacuum_insert_scale_factor patch when it
was committed for Postgres 13 -- but it wasn't. There is a regression
that was tied to autovacuum_vacuum_insert_scale_factor in Postgres 13
by Mark Callaghan:

https://smalldatum.blogspot.com/2021/01/insert-benchmark-postgres-is-still.html

The blog post says: "Updates - To understand the small regression
mentioned above for the l.i1 test (more CPU & write IO) I repeated the
test with 100M rows using 2 configurations: one disabled index
deduplication and the other disabled insert-triggered autovacuum.
Disabling index deduplication had no effect and disabling
insert-triggered autovacuum resolves the regression."

I think that this regression is almost entirely explainable by the
need to unnecessarily scan indexes for autovacuum VACUUMs that just
need to set the visibility map. This issue is basically avoidable,
just by removing the vacuum_cleanup_index_scale_factor cleanup-only
VACUUM criteria (per my second patch).

2. I fixed a bug in nbtree deduplication btvacuumcleanup() stats in
commit 48e12913. This fix still left things in kind of a bad state:
there are still cases where the btvacuumcleanup()-only VACUUM case
will set pg_class.reltuples to a value that is significantly below
what it should be (it all depends on how effective deduplication is
with the data). These remaining cases are effectively fixed by the
second patch.

I probably should have made btvacuumcleanup()-only VACUUMs set
"stats->estimate_count = true" when I was working on the fix that
became commit 48e12913. Purely because my approach was inherently
approximate with posting list tuples, and so shouldn't be trusted for
anything important (num_index_tuples is suitable for VACUUM VERBOSE
output only in affected cases). I didn't set "stats->estimate_count =
true" in affected cases because I was worried about unforeseen
consequences. But this seems defensible now, all things considered.

There are other things that are slightly broken but will be fixed by
the first patch. But I'm really just worried about these two cases in
Postgres 13.

Thanks for weighing in
--
Peter Geoghegan
From 29079ec92eed077c8d4d65e4c3db16ce41578657 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Mon, 1 Mar 2021 14:40:57 -0800
Subject: [PATCH 1/2] VACUUM ANALYZE: Always set pg_class.reltuples.

Stop assuming that pg_class must have been updated with accurate
statistics within VACUUM ANALYZE -- update pg_class for indexes at the
same time as the table relation in all cases.

VACUUM sometimes gives less accurate statistics than ANALYZE, or no
statistics at all.  It seems advisable to effectively assume that that
always happens.  The worst that can happen is that we'll inaccurately
set pg_class.reltuples for indexes whose heap relation would already
have had an inaccurate pg_class.reltuples anyway.

Author: Peter Geoghegan 
Discussion: https://postgr.es/m/CAH2-WzknxdComjhqo4SUxVFk_Q1171GJO2ZgHZ1Y6pion6u8rA@mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c |  3 +-
 src/backend/commands/analyze.c   | 49 
 2 files changed, 31 insertions(+), 21 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d8f847b0e6..22ca7f5c54 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1737,7 +1737,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		end_parallel_vacuum(indstats, lps, nindexes);
 
 	/* Update index statistics */
-	update_index_statistics(Irel, indstats, nindexes);
+	if (vacrelstats->useindex)
+		update_index_statistics(Irel, indstats, nindexes);
 
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 7295cf0215..7ae1cd426c 100644
--- 

Re: Proposal: Save user's original authenticated identity for logging

2021-03-08 Thread Jacob Champion
On Sat, 2021-03-06 at 18:33 +0100, Magnus Hagander wrote:
> It looks like patch 0001 has some leftover debuggnig code at the end?
> Or did you intend for that to be included permanently?

I'd intended to keep it -- it works hand-in-hand with the existing
"current_logfiles" log line on 219 and might keep someone from tearing
their hair out. But I can certainly remove it, if it's cluttering up
the logs too much.

> As for log escaping, we report port->user_name already unescaped --
> surely this shouldn't be a worse case than that?

Ah, that's a fair point. I'll remove the TODO.

> I wonder if it wouldn't be better to keep the log line on the existing
> "connection authorized" line, just as a separate field. I'm kind of
> split on it though, because I guess it might make that line very long.
> But it's also a lot more convenient to parse it on a single line than
> across multiple lines potentially overlapping with other sessions.

Authentication can succeed even if authorization fails, and it's useful
to see that in the logs. In most cases that looks like a failed user
mapping, but there are other corner cases where we fail the connection
after a successful authentication, such as when using krb_realm.
Currently you get little to no feedback when that happens, but with a
separate log line, it's a lot easier to piece together what's happened.

(In general, I feel pretty strongly that Postgres combines/conflates
authentication and authorization in too many places.)

> With this we store the same value as the authn and as
> port->gss->princ, and AFAICT it's only used once. Seems we could just
> use the new field for the gssapi usage as well? Especially since that
> usage only seems to be there in order to do the gssapi specific
> logging of, well, the same thing.
> 
> Same goes for peer_user? In fact, if we're storing it in the Port, why
> are we even passing it as a separate parameter to check_usermap --
> shouldn't that one always use this same value? ISTM that it could be
> quite confusing if the logged value is different from whatever we
> apply to the user mapping?

Seems reasonable; I'll consolidate them.

--Jacob


Re: proposal - operators ? and ->> for type record, and functions record_keys and record_each_text

2021-03-08 Thread Tom Lane
Pavel Stehule  writes:
> I found an interesting idea to have some basic functions and operators for
> record type (similar to json, jsonb or hstore).

I think this is a pretty bad idea, because there's no way to know what
data type the result of -> should be.  "Smash it all to text" is a hack,
not a solution --- and if you find that hack satisfactory, you might as
well be using json or hstore.

Most of the other things you mention are predicated on the assumption
that the field set will vary from one value to the next, which again
seems more like something you'd do with json or hstore than with SQL
composites.

regards, tom lane




Re: Removing vacuum_cleanup_index_scale_factor

2021-03-08 Thread Tom Lane
Peter Geoghegan  writes:
> I think that a simpler approach would work better: When
> ANALYZE/do_analyze_rel() decides whether or not it should call
> vac_update_relstats() for each index, it should simply not care
> whether or not this is a VACUUM ANALYZE (as opposed to a simple
> ANALYZE). This is already what we do for the heap relation itself. Why
> shouldn't we do something similar for indexes?

> What do you think, Tom? Your bugfix commit b4b6923e03f from 2011
> taught do_analyze_rel() to not care about whether VACUUM took place
> earlier in the same command -- though only in the case of the heap
> relation (not in the case of its indexes). That decision now seems a
> bit arbitrary to me.

Well, nobody had complained about the index stats at that point,
so I don't think I was thinking about that aspect of it.

As you say, the history here is a bit convoluted, but it seems like
a good principle to avoid interconnections between VACUUM and ANALYZE
as much as we can.  I haven't been paying enough attention to this
thread to have more insight than that.

regards, tom lane




proposal - operators ? and ->> for type record, and functions record_keys and record_each_text

2021-03-08 Thread Pavel Stehule
Hi

When I wrote a reply to the question

https://stackoverflow.com/questions/66523737/postgresql-10-pl-pgsql-test-if-column-exits-in-a-record-variable

I found an interesting idea to have some basic functions and operators for
record type (similar to json, jsonb or hstore).

Now we can do almost all of these tasks on the record type by casting to
jsonb. But this transformation has some overhead (and for some tasks it is
not necessary), and it is not very intuitive either.

I don't think we need full functionality like hstore or jsonb has (if only
because the record type cannot be persisted and indexed), but some basic
functionality can be useful.

-- tests of basic helper functions for record type
do $$
declare
  r record;
  k text; v text; t text;
begin
  select oid, relname, relnamespace, reltype from pg_class limit 1 into r;
  if not r ? 'xxx' then
raise notice 'pg_class has not column xxx';
  end if;

  if r ? 'relname' then
raise notice 'pg_class has column relname';
  end if;

  foreach k in array record_keys_array(r)
  loop
raise notice '% => %', k, r->>k;
  end loop;

  raise notice '---';

  -- second (slower) variant
  for k in select * from record_keys(r)
  loop
raise notice '% => %', k, r->>k;
  end loop;

  raise notice '---';

  -- complete unpacking
  for k, v, t in select * from record_each_text(r)
  loop
raise notice '% => %(%)', k, v, t;
  end loop;
end;
$$;

What do you think about this proposal?

Comments, notes?

Regards

Pavel
diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c
index 23787a6ae7..664ccc6c44 100644
--- a/src/backend/utils/adt/rowtypes.c
+++ b/src/backend/utils/adt/rowtypes.c
@@ -20,9 +20,11 @@
 #include "access/htup_details.h"
 #include "catalog/pg_type.h"
 #include "common/hashfn.h"
+#include "executor/spi.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/datum.h"
 #include "utils/lsyscache.h"
@@ -67,6 +69,15 @@ typedef struct RecordCompareData
 	ColumnCompareData columns[FLEXIBLE_ARRAY_MEMBER];
 } RecordCompareData;
 
+/*
+ * Structure used for list of keys
+ */
+typedef struct
+{
+	ArrayBuildState *astate;
+	Tuplestorestate *tupstore;
+	TupleDesc	tupdesc;
+} RecordKeysOutputData;
 
 /*
  * record_in		- input routine for any composite type.
@@ -2015,3 +2026,360 @@ hash_record_extended(PG_FUNCTION_ARGS)
 
 	PG_RETURN_UINT64(result);
 }
+
+/*
+ * Few simple functions and operators for work with record type.
+ */
+
+/*
+ * Returns true, if record has a field on top level.
+ */
+Datum
+record_exists(PG_FUNCTION_ARGS)
+{
+	HeapTupleHeader rec = PG_GETARG_HEAPTUPLEHEADER(0);
+	char	   *fname = text_to_cstring(PG_GETARG_TEXT_PP(1));
+	Oid			tupType;
+	int32		tupTypmod;
+	TupleDesc	tupdesc;
+	bool		result = false;
+	int		i;
+
+	/* Extract type info from the tuple itself */
+	tupType = HeapTupleHeaderGetTypeId(rec);
+	tupTypmod = HeapTupleHeaderGetTypMod(rec);
+	tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);
+
+	for (i = 0; i < tupdesc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(tupdesc, i);
+		char	   *attname;
+
+		if (att->attisdropped)
+			continue;
+
+		attname = NameStr(att->attname);
+
+		if (strcmp(fname, attname) == 0)
+		{
+			result = true;
+			break;
+		}
+	}
+
+	ReleaseTupleDesc(tupdesc);
+
+	PG_RETURN_BOOL(result);
+}
+
+/*
+ * Returns text value of record field. Raise an error when required
+ * key doesn't exists
+ */
+Datum
+record_field_text(PG_FUNCTION_ARGS)
+{
+	HeapTupleHeader rec = PG_GETARG_HEAPTUPLEHEADER(0);
+	char	   *fname = text_to_cstring(PG_GETARG_TEXT_P(1));
+	HeapTupleData	tuple;
+	Oid			tupType;
+	int32		tupTypmod;
+	TupleDesc	tupdesc;
+	int			fno;
+	bool		isnull = true;
+	char	   *outstr = NULL;
+
+	/* Extract type info from the tuple itself */
+	tupType = HeapTupleHeaderGetTypeId(rec);
+	tupTypmod = HeapTupleHeaderGetTypMod(rec);
+	tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);
+
+	/* Build a temporary HeapTuple control structure */
+	tuple.t_len = HeapTupleHeaderGetDatumLength(rec);
+	ItemPointerSetInvalid(&(tuple.t_self));
+	tuple.t_tableOid = InvalidOid;
+	tuple.t_data = rec;
+
+	fno = SPI_fnumber(tupdesc, fname);
+	if (fno != SPI_ERROR_NOATTRIBUTE)
+	{
+		Datum	value = SPI_getbinval(&tuple, tupdesc, fno, &isnull);
+
+		if (!isnull)
+		{
+			bool	typIsVarlena;
+			Oid		typoutput;
+			FmgrInfo		proc;
+			Oid	typeid;
+
+			typeid = SPI_gettypeid(tupdesc, fno);
+			getTypeOutputInfo(typeid, &typoutput, &typIsVarlena);
+			fmgr_info(typoutput, &proc);
+			outstr = OutputFunctionCall(&proc, value);
+		}
+	}
+
+	ReleaseTupleDesc(tupdesc);
+
+	if (!isnull)
+	{
+		Assert(outstr);
+
+		PG_RETURN_TEXT_P(cstring_to_text(outstr));
+	}
+
+	PG_RETURN_NULL();
+}
+
+/*
+ * Returns set of composite type (key text, value text, type text) of all
+ * fields in record.
+ */
+Datum
+record_each_text(PG_FUNCTION_ARGS)
+{
+	HeapTupleHeader rec = PG_GETARG_HEAPTUPLEHEADER(0);
+	HeapTupleData	tuple;
+	Oid			tupType;
+	int32		tupTypmod;
+	TupleDesc	tupdesc;
+	

Re: [HACKERS] Custom compression methods

2021-03-08 Thread Robert Haas
On Mon, Mar 8, 2021 at 3:59 PM Justin Pryzby  wrote:
> > It would be nice to have a way to force
> > anything compressed with the old method to be re-compressed with the
> > new method, but not having that doesn't preclude allowing the
> > parameter to be changed.
>
> Doesn't vacuum full/cluster/dump+restore do that ?

Well, dump and restore will do it, certainly, but I don't think VACUUM
FULL or CLUSTER will. I haven't tested it, though, so maybe I'm wrong.

> > I think the pg_dump argument should be --no-toast-compression, not
> > --no-toast-compressions. I agree with Justin that pg_restore should
> > have the option also.
>
> I mentioned that this is hard to do, since the compression is stored inside 
> the
> text blob that creates the whole table...Unless toast compression is a
> per-relation property rather than per-attribute.  I don't think pg_restore
> should try to reverse-engineer the text output by pg_dump to elide the
> "COMPRESSION lz4".

Oh, yeah. I guess we have to leave that out then.

> I think maybe CREATE shouldn't support COMPRESSION at all, and pg_dump/restore
> would use ALTER.  That makes this very slightly less of an issue, as one can
> use pg_restore -f- |grep -v '^ALTER TABLE .* SET COMPRESSION' |psql -d,
> rather than sed 's/COMPRESSION lz4//'

TBH, that doesn't seem very nice to me. I think it's usually better to
create objects with the right properties initially rather than
creating them wrong and then fixing them afterwards.

> > Man, it would be really nice to be able to set the default for new
> > tables, rather than having all these places hard-coded to use
> > DefaultCompressionMethod. Surely lotsa people are going to want to set
> > toast_compression = lz4 in postgresql.conf and forget about it.
>
> I don't understand - isn't that what 0002 does ?

Oh, man, you want me to look at all the patches and not just the first one? :-)

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Removing vacuum_cleanup_index_scale_factor

2021-03-08 Thread Peter Geoghegan
On Tue, Mar 2, 2021 at 6:01 PM Peter Geoghegan  wrote:
> 1. Any objections to the idea of teaching VACUUM ANALYZE to
> distinguish between the cases where VACUUM ran and performed "real
> index vacuuming", to make it more intelligent about overwriting
> pg_class stats for indexes?

I think that a simpler approach would work better: When
ANALYZE/do_analyze_rel() decides whether or not it should call
vac_update_relstats() for each index, it should simply not care
whether or not this is a VACUUM ANALYZE (as opposed to a simple
ANALYZE). This is already what we do for the heap relation itself. Why
shouldn't we do something similar for indexes?

What do you think, Tom? Your bugfix commit b4b6923e03f from 2011
taught do_analyze_rel() to not care about whether VACUUM took place
earlier in the same command -- though only in the case of the heap
relation (not in the case of its indexes). That decision now seems a
bit arbitrary to me.

I should point out that this is the *opposite* of what we did from
2004 - 2011 (following Tom's 2004 commit f0c9397f808). During that
time the policy was to not update pg_class.reltuples inside
do_analyze_rel() when we knew that VACUUM ran. The policy was at least
the same for indexes and the heap/table during this period, so it was
consistent in that sense. However, I don't think that we should
reintroduce that policy now. Doing so would be contrary to the API
contract for index AMs established by Tom's 2006 commit e57345975cf --
that allowed amvacuumcleanup() to be a no-op when there was no
ambulkdelete() call (it also taught hashvacuumcleanup() to do just
that).

To recap, our ultimate goal here is to make btvacuumcleanup() close to
hashvacuumcleanup() -- it should be able to skip all cleanup when
there was no btbulkdelete() call during the same VACUUM (nbtree page
deletion still has cases that force us to do real work in the absence
of a btbulkdelete() call for the VACUUM, but that remaining exception
should be very rare).

-- 
Peter Geoghegan




Re: [HACKERS] Custom compression methods

2021-03-08 Thread Justin Pryzby
On Mon, Mar 08, 2021 at 03:26:04PM -0500, Robert Haas wrote:
> On Mon, Mar 8, 2021 at 5:02 AM Dilip Kumar  wrote:
> > So now only pending point is,  how do we handle the upgrade when you
> > are upgrading from --with-lz4 to --without-lz4 binary and a couple of
> > options discussed here are
> > a) Should we allow table creation with lz4 even if it is compiled
> > --without-lz4?  In case of xml we always allow table creation even if
> > it is compiled --wthout-libxml
> > b) Instead of allowing this always, only allow during binary upgrade.

> It would be nice to have a way to force
> anything compressed with the old method to be re-compressed with the
> new method, but not having that doesn't preclude allowing the
> parameter to be changed.

Doesn't vacuum full/cluster/dump+restore do that ?

> I think the pg_dump argument should be --no-toast-compression, not
> --no-toast-compressions. I agree with Justin that pg_restore should
> have the option also.

I mentioned that this is hard to do, since the compression is stored inside the
text blob that creates the whole table...Unless toast compression is a
per-relation property rather than per-attribute.  I don't think pg_restore
should try to reverse-engineer the text output by pg_dump to elide the
"COMPRESSION lz4".

I think maybe CREATE shouldn't support COMPRESSION at all, and pg_dump/restore
would use ALTER.  That makes this very slightly less of an issue, as one can
use pg_restore -f- |grep -v '^ALTER TABLE .* SET COMPRESSION' |psql -d,
rather than sed 's/COMPRESSION lz4//'
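
For reference, the two forms in question look like this (syntax as used in
this patch series; untested here):

CREATE TABLE t (payload text COMPRESSION lz4);           -- what pg_dump puts in the CREATE
ALTER TABLE t ALTER COLUMN payload SET COMPRESSION lz4;  -- the ALTER form discussed above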

> Man, it would be really nice to be able to set the default for new
> tables, rather than having all these places hard-coded to use
> DefaultCompressionMethod. Surely lotsa people are going to want to set
> toast_compression = lz4 in postgresql.conf and forget about it.

I don't understand - isn't that what 0002 does ?

Subject: [PATCH v32 2/4] Add default_toast_compression GUC  

   

-- 
Justin




Re: Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Pavel Stehule
po 8. 3. 2021 v 21:12 odesílatel Tom Lane  napsal:

> "Joel Jacobson"  writes:
> > If I understand it correctly, we don't need to run genbki.pl to compile
> PostgreSQL,
> > so someone wanting to compile PostgreSQL without having a running
> PostgreSQL-instance
> > could do so without problems.
> > A dependency on having a PostgreSQL instance running,
> > is perhaps acceptable for hackers developing PostgreSQL?
> > But of course not for normal users just wanting to compile PostgreSQL.
> > If we think there is at least a 1% chance this is a feasible idea,
> > I'm willing to try implementing a SQL/PLpgSQL-version of genbki.pl.
>
> No, I think this is a non-starter.  Bootstrapping from just the
> contents of the git repo is something developers do all the time
> (and indeed the buildfarm does it in every run).  We do not want to
> need a running PG instance in advance of doing that.
>
> Yeah, we could make it work if we started treating all the genbki
> output files as things to include in the git repo, but I don't think
> anybody wants to go there.
>
> I understand some folks' distaste for Perl, and indeed I don't like it
> that much myself.  If we were starting over from scratch I'm sure
> we'd choose a different language for our build/test infrastructure.
> But that's where we are, and I would not be in favor of having more
> than one scripting language as build requirements.  So Perl is going
> to be it unless somebody gets ambitious enough to replace all the Perl
> scripts at once, which seems unlikely to happen.
>
> > I agree, I like the 2-D array version, but only if a we could provide a
> C-function
> > to allow unnesting N+1 dims to N dims. Is that a fruitful idea, or are
> there
> > reasons why it cannot be done easily? I could give it a try, if we think
> it's a good idea.
>
> +1, I think this need has come up before.  My guess is that the
> hardest part of that will be choosing a function name that will
> satisfy everybody ;-).
>
> Could there be any value in allowing unnesting a variable number
> of levels?  If so, we could dodge the naming issue by inventing
> "unnest(anyarray, int) returns anyarray" where the second argument
> specifies the number of subscript levels to remove, or perhaps
> the number to keep.
>

so what about?

CREATE OR REPLACE FUNCTION unnest_slice(anyarray, int)
RETURNS SETOF anyarray AS $$
DECLARE r $1%type;
BEGIN
  FOREACH r SLICE $2 IN ARRAY $1 --- now $2 should be constant
  LOOP
RETURN NEXT r;
  END LOOP;
END;
$$ LANGUAGE plpgsql;
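
For what it's worth, a variant with the slice level hard-coded already works
today, since FOREACH ... SLICE requires an integer constant (a sketch;
unnest_slice1 is just an illustrative name):

CREATE OR REPLACE FUNCTION unnest_slice1(anyarray)
RETURNS SETOF anyarray AS $$
DECLARE r $1%type;
BEGIN
  FOREACH r SLICE 1 IN ARRAY $1  -- peel off exactly one dimension
  LOOP
    RETURN NEXT r;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT unnest_slice1('{{1,2},{3,4}}'::int[]);
-- returns two rows: {1,2} and {3,4}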

Regards

Pavel

> regards, tom lane


Re: Confusing behavior of psql's \e

2021-03-08 Thread Jacob Champion
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:tested, passed

Very nice quality-of-life improvement. Thanks!

The new status of this patch is: Ready for Committer


Re: Implementing Incremental View Maintenance

2021-03-08 Thread Andrew Dunstan


On 2/18/21 9:01 PM, Yugo NAGATA wrote:
> On Thu, 18 Feb 2021 19:38:44 +0800
> Andy Fan  wrote:
>
>> On Tue, Feb 16, 2021 at 9:33 AM Yugo NAGATA  wrote:
>>
>>> Hi,
>>>
>>> Attached is a rebased patch (v22a).
>>>
>> Thanks for the patch. Will you think posting a patch with the latest commit
>> at that
>> time is helpful? If so, when others want to review it,  they know which
>> commit to
>> apply the patch without asking for a new rebase usually.
> I rebased the patch because cfbot failed.
> http://cfbot.cputube.org/
>

It's bitrotted a bit more due to commits bb437f995d and 25936fd46c


(A useful feature of the cfbot might be to notify the authors and
reviewers when it detects bitrot for a previously passing entry.)


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: pg_amcheck contrib application

2021-03-08 Thread Mark Dilger



> On Mar 8, 2021, at 8:26 AM, Robert Haas  wrote:
> 
> On Thu, Mar 4, 2021 at 5:39 PM Mark Dilger  
> wrote:
>> I think Robert mistook why I was doing that.  I was thinking about a 
>> different usage pattern.  If somebody thinks a subset of relations have been 
>> badly corrupted, but doesn't know which relations those might be, they might 
>> try to find them with pg_amcheck, but wanting to just check the first few 
>> blocks per relation in order to sample the relations.  So,
>> 
>>  pg_amcheck --startblock=0 --endblock=9 --no-dependent-indexes
>> 
>> or something like that.  I don't think it's very fun to have it error out 
>> for each relation that doesn't have at least ten blocks, nor is it fun to 
>> have those relations skipped by error'ing out before checking any blocks, as 
>> they might be the corrupt relations you are looking for.  But using 
>> --startblock and --endblock for this is not a natural fit, as evidenced by 
>> how I was trying to "fix things up" for the user, so I'll punt on this usage 
>> until some future version, when I might add a sampling option.
> 
> I admit I hadn't thought of that use case. I guess somebody could want
> to do that, but it doesn't seem all that useful. Checking the first
> up-to-ten blocks of every relation is not a very representative
> sample, and it's not clear to me that sampling is a good idea even if
> it were representative. What good is it to know that 10% of my
> database is probably not corrupted?


`cd $PGDATA; tar xfz my_csv_data.tgz` ctrl-C ctrl-C ctrl-C
`rm -rf $PGDATA` ctrl-C ctrl-C ctrl-C
`/my/stupid/backup/and/restore/script.sh` ctrl-C ctrl-C ctrl-C

# oh wow, i wonder if any relations got overwritten with csv file data, or had 
their relation files unlinked, or ...?

`pg_amcheck --jobs=8 --startblock=0 --endblock=10`

# ah, darn, it's spewing lots of irrelevant errors because some relations are 
too short

`pg_amcheck --jobs=8 --startblock=0 --endblock=0`

# ah, darn, it's still spewing lots of irrelevant errors because I have lots of 
indexes with zero blocks of data

`pg_amcheck --jobs=8`

# ah, darn, it's taking forever, because it's processing huge tables in their 
entirety

I agree this can be left to later, and the --startblock and --endblock options 
are the wrong way to do it.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [HACKERS] Custom compression methods

2021-03-08 Thread Robert Haas
On Mon, Mar 8, 2021 at 5:02 AM Dilip Kumar  wrote:
> So now only pending point is,  how do we handle the upgrade when you
> are upgrading from --with-lz4 to --without-lz4 binary and a couple of
> options discussed here are
> a) Should we allow table creation with lz4 even if it is compiled
> --without-lz4?  In case of xml we always allow table creation even if
> it is compiled --without-libxml
> b) Instead of allowing this always, only allow during binary upgrade.

I think the basic answer to (a) is that it doesn't matter. Suppose the
user is not upgrading but just feels like creating a table that is
configured to use LZ4 compression. Does it really matter whether they
get the error when they create the table or when they load the data?
Personally, I think it is slightly more user-friendly to give the
error when they try to create the table, because the problem doesn't
occur when inserting ANY row, but only when inserting rows that are
wide enough that compression will occur. It's not that great to have
people create a table and then find out only much later that it
doesn't work. On the other hand, consistency with the way the xml data
type already works seems like a fair enough argument for letting the
error happen when we try to actually use the compression method. So I
can't get worked up about it either way.

Regarding (b), it seems to me that with this approach, we have to
document that pg_upgrade from binaries that support LZ4 to binaries
that don't support LZ4 is fundamentally unsafe. You might have
LZ4-compressed values in your columns even if they are now set to use
PGLZ, and you might have LZ4'd data inside composite values that are
on disk someplace. We have no idea whether those things are true or
not, and we can't prevent you from upgrading to something that makes
part of your data inaccessible. Given that, if we go with this
approach, I think we should expend exactly 0 code trying to making
pg_upgrade pass in cases where there are LZ4 columns in the database
and the new binaries don't support LZ4. Just because the user goes and
gets rid of all the LZ4 columns before upgrading doesn't mean that the
upgrade is safe, but if they haven't even done that much, maybe they
should reconsider things a bit.

Some other review comments:

toast_get_compression_method() should now return char, not Oid.

With this design, we can support changing the compression method on a
column quite easily. It's just a hint, like the STORAGE parameter. It
has no bearing on what can be present in the table, but just controls
how new values are stored. It would be nice to have a way to force
anything compressed with the old method to be re-compressed with the
new method, but not having that doesn't preclude allowing the
parameter to be changed.

I am tempted to propose that we collapse compress_lz4.c and
compress_pglz.c into a single file, get rid of the directory, and just
have something like src/backend/access/common/toast_compression.c. The
files are awfully short, and making a whole new directory for that
small amount of code seems like overkill.

I think the pg_dump argument should be --no-toast-compression, not
--no-toast-compressions. I agree with Justin that pg_restore should
have the option also.

Man, it would be really nice to be able to set the default for new
tables, rather than having all these places hard-coded to use
DefaultCompressionMethod. Surely lotsa people are going to want to set
toast_compression = lz4 in postgresql.conf and forget about it.

Is there any reason not to change varattrib_4b's description of
va_tcinfo that says "and flags" to instead say "and compression
method"? And rename VARFLAGS_4B_C to VARCOMPRESS_4B_C? I don't know
why we should call it flags when we know it's specifically compression
information.

You should probably have a test that involves altering the type of a
varlena column to non-varlena, and the other way around, and make sure
that changing integer -> text sets attcompression and doing the
reverse clears it.
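
Something along these lines is roughly what I have in mind (a sketch only; it
assumes the patch's pg_attribute.attcompression column behaves as described
above):

CREATE TABLE cmtest (a integer);
ALTER TABLE cmtest ALTER COLUMN a TYPE text;
SELECT attcompression FROM pg_attribute
  WHERE attrelid = 'cmtest'::regclass AND attname = 'a';  -- expect it set
ALTER TABLE cmtest ALTER COLUMN a TYPE integer USING a::integer;
SELECT attcompression FROM pg_attribute
  WHERE attrelid = 'cmtest'::regclass AND attname = 'a';  -- expect it cleared
DROP TABLE cmtest;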

You need to update catalogs.sgml.

On the whole I don't see a whole lot to complain about here. I don't
love giving up on the idea of tracking which compression methods are
used where, but making that work without performance regressions seems
very difficult and perhaps just outright impossible, and dealing with
all the concurrency problems that introduces is a pain, too. I think
accepting a feature that gives us LZ4 compression is better than
rejecting it because we can't solve those problems.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Tom Lane
"Joel Jacobson"  writes:
> If I understand it correctly, we don't need to run genbki.pl to compile 
> PostgreSQL,
> so someone wanting to compile PostgreSQL without having a running 
> PostgreSQL-instance
> could do so without problems.
> A dependency on having a PostgreSQL instance running,
> is perhaps acceptable for hackers developing PostgreSQL?
> But of course not for normal users just wanting to compile PostgreSQL.
> If we think there is at least a 1% chance this is a feasible idea,
> I'm willing to try implementing a SQL/PLpgSQL-version of genbki.pl.

No, I think this is a non-starter.  Bootstrapping from just the
contents of the git repo is something developers do all the time
(and indeed the buildfarm does it in every run).  We do not want to
need a running PG instance in advance of doing that.

Yeah, we could make it work if we started treating all the genbki
output files as things to include in the git repo, but I don't think
anybody wants to go there.

I understand some folks' distaste for Perl, and indeed I don't like it
that much myself.  If we were starting over from scratch I'm sure
we'd choose a different language for our build/test infrastructure.
But that's where we are, and I would not be in favor of having more
than one scripting language as build requirements.  So Perl is going
to be it unless somebody gets ambitious enough to replace all the Perl
scripts at once, which seems unlikely to happen.

> I agree, I like the 2-D array version, but only if we could provide a 
> C-function
> to allow unnesting N+1 dims to N dims. Is that a fruitful idea, or are there
> reasons why it cannot be done easily? I could give it a try, if we think it's 
> a good idea.

+1, I think this need has come up before.  My guess is that the
hardest part of that will be choosing a function name that will
satisfy everybody ;-).

Could there be any value in allowing unnesting a variable number
of levels?  If so, we could dodge the naming issue by inventing
"unnest(anyarray, int) returns anyarray" where the second argument
specifies the number of subscript levels to remove, or perhaps
the number to keep.

regards, tom lane




Re: About to add WAL write/fsync statistics to pg_stat_wal view

2021-03-08 Thread David G. Johnston
On Mon, Mar 8, 2021 at 8:48 AM Fujii Masao 
wrote:

>
> Thanks for updating the patch! I applied cosmetic changes to that.
> Patch attached. Barring any objection, I will commit this version.
>

Read over the patch and it looks good.

One minor "the" omission (in a couple of places, copy-paste style):

+   See  for more information about
+   internal WAL function XLogWrite.

"about *the* internal WAL function"

Also, I'm not sure why you find it helpful to omit documenting that the
millisecond field has a fractional part out to microseconds.

David J.


Re: automatic analyze: readahead - add "IO read time" log message

2021-03-08 Thread Stephen Frost
Greetings,

* Tomas Vondra (tomas.von...@enterprisedb.com) wrote:
> On 2/10/21 11:10 PM, Stephen Frost wrote:
> > * Heikki Linnakangas (hlinn...@iki.fi) wrote:
> >> On 05/02/2021 23:22, Stephen Frost wrote:
> >>> Unless there's anything else on this, I'll commit these sometime next
> >>> week.
> >>
> >> One more thing: Instead of using 'effective_io_concurrency' GUC directly,
> >> should call get_tablespace_maintenance_io_concurrency().
> > 
> > Ah, yeah, of course.
> > 
> > Updated patch attached.
> 
> A couple minor comments:
> 
> 1) I think the patch should be split into two parts, one adding the
> track_io_timing, one adding the prefetching.

This was already done..

> 2) I haven't tried but I'm pretty sure there'll be a compiler warning
> about 'prefetch_maximum' being unused without USE_PREFETCH defined.

Ah, that part is likely true; moved it down into the #ifdef block to
address that, which is also good since it should avoid accidentally using
it outside of the #ifdefs later on.

> 3) Is there a way to reduce the amount of #ifdef in acquire_sample_rows?

Perhaps..

> This makes the code rather hard to read, IMHO. It seems to me we can
> move the code around a bit and merge some of the #ifdef blocks - see the
> attached patch. Most of this is fairly trivial, with the exception of
> moving PrefetchBuffer before table_scan_analyze_next_block - AFAIK this
> does not materially change the behavior, but perhaps I'm wrong.

but I don't particularly like doing the prefetch right before we run
vacuum_delay_point() and potentially sleep.

Rebased and updated patch attached.

Thanks!

Stephen
From ee8e7d0bfa28b8b41b8d9995f64f881d9ce351bd Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Fri, 5 Feb 2021 15:59:02 -0500
Subject: [PATCH 1/2] Improve logging of auto-vacuum and auto-analyze

When logging auto-vacuum and auto-analyze activity, include the I/O
timing if track_io_timing is enabled.  Also, for auto-analyze, add the
read rate and the dirty rate, similar to how that information has
historically been logged for auto-vacuum.

Stephen Frost and Jakub Wartak

Reviewed-By: Heikki Linnakangas
Discussion: https://www.postgresql.org/message-id/VI1PR0701MB69603A433348EDCF783C6ECBF6EF0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
---
 doc/src/sgml/config.sgml |   8 ++-
 src/backend/access/heap/vacuumlazy.c |  18 +
 src/backend/commands/analyze.c   | 100 +--
 3 files changed, 116 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 967de73596..492ed78e1c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7457,9 +7457,11 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
 I/O timing information is
 displayed in 
 pg_stat_database, in the output of
- when the BUFFERS option is
-used, and by .  Only superusers can
-change this setting.
+ when the BUFFERS option
+is used, by autovacuum for auto-vacuums and auto-analyzes, when
+ is set and by
+.  Only superusers can change this
+setting.

   
  
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d8f847b0e6..ab40be0771 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -440,6 +440,8 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 	TransactionId new_frozen_xid;
 	MultiXactId new_min_multi;
 	ErrorContextCallback errcallback;
+	PgStat_Counter startreadtime = 0;
+	PgStat_Counter startwritetime = 0;
 
 	Assert(params != NULL);
 	Assert(params->index_cleanup != VACOPT_TERNARY_DEFAULT);
@@ -454,6 +456,11 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 	{
 		pg_rusage_init(&ru0);
 		starttime = GetCurrentTimestamp();
+		if (track_io_timing)
+		{
+			startreadtime = pgStatBlockReadTime;
+			startwritetime = pgStatBlockWriteTime;
+		}
 	}
 
 	if (params->options & VACOPT_VERBOSE)
@@ -674,6 +681,17 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 			 (long long) VacuumPageDirty);
 			appendStringInfo(&buf, _("avg read rate: %.3f MB/s, avg write rate: %.3f MB/s\n"),
 			 read_rate, write_rate);
+			if (track_io_timing)
+			{
+appendStringInfoString(&buf, _("I/O Timings:"));
+if (pgStatBlockReadTime - startreadtime > 0)
+	appendStringInfo(&buf, _(" read=%.3f"),
+	 (double) (pgStatBlockReadTime - startreadtime) / 1000);
+if (pgStatBlockWriteTime - startwritetime > 0)
+	appendStringInfo(&buf, _(" write=%.3f"),
+	 (double) (pgStatBlockWriteTime - startwritetime) / 1000);
+appendStringInfoChar(&buf, '\n');
+			}
 			appendStringInfo(&buf, _("system usage: %s\n"), pg_rusage_show(&ru0));
 			appendStringInfo(&buf,
 			 _("WAL usage: %ld records, %ld full page images, %llu bytes"),
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 7295cf0215..0864cda80e 100644
--- 

Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Joel Jacobson
On Mon, Mar 8, 2021, at 19:46, Joel Jacobson wrote:
> However, for certain tasks, when a high-level language is preferred,
> and when the raw performance of C isn't necessary, then maybe SQL/PLpgSQL
> could be a serious alternative to Perl? 

Before we had jsonb, this would have been totally unrealistic.

But with jsonb, I think we actually have complete coverage of Perl's data types:

Perl array <=> jsonb array
Perl hash <=> jsonb object
Perl scalar <=> jsonb string/boolean/number

I've been using jsonb with great success for code generation.
ASTs are nicely represented as nested jsonb arrays.
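
For example (nothing fancy, just illustrating the mappings above):

SELECT '["foo", ["bar", 42], {"k": true}]'::jsonb;  -- nested arrays/objects, e.g. an AST node
SELECT jsonb_typeof('"x"'::jsonb),   -- string
       jsonb_typeof('true'::jsonb),  -- boolean
       jsonb_typeof('42'::jsonb);    -- number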

/Joel



Re: Why isn't pg_stat_get_subscription() marked as proretset?

2021-03-08 Thread Tom Lane
I wrote:
> The code in pg_stat_get_subscription() appears to believe that it
> can return a set of rows, but its pg_proc entry does not have
> proretset set.  It may be that this somehow accidentally fails
> to malfunction when the function is used via the system views,
> but if you try to call it directly it falls over:
> regression=# select pg_stat_get_subscription(0);
> ERROR:  set-valued function called in context that cannot accept a set

Indeed, the reason we have not noticed this mistake is that
nodeFunctionscan.c and execSRF.c do not complain if a function-in-FROM
returns a tuplestore without having been marked as proretset.
That seems like a bad idea: it only accidentally works at all,
I think, and we might break such cases unknowingly via future code
rearrangement in that area.  There are also bad consequences
elsewhere, such as that the planner mistakenly expects the function
to return just one row, which could result in poor plan choices.

So I think we should not just fix the bogus pg_proc marking, but
also change the executor to complain if a non-SRF tries to return
a set.  I propose the attached.

(I initially had it complain if a non-SRF returns returnMode ==
SFRM_ValuePerCall and isDone == ExprEndResult, but it turns out that
plperl sometimes does that as a way of returning NULL.  I'm not
sufficiently excited about this to go change that, so the patch lets
that case pass.)

regards, tom lane

diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index 8aec3b549b..545b6c19da 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -353,11 +353,21 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 			 */
 			if (rsinfo.isDone != ExprMultipleResult)
 break;
+
+			/*
+			 * Check that set-returning functions were properly declared.
+			 * (Note: for historical reasons, we don't complain if a non-SRF
+			 * returns ExprEndResult; that's treated as returning NULL.)
+			 */
+			if (!returnsSet)
+ereport(ERROR,
+		(errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
+		 errmsg("table-function protocol for value-per-call mode was not followed")));
 		}
 		else if (rsinfo.returnMode == SFRM_Materialize)
 		{
 			/* check we're on the same page as the function author */
-			if (!first_time || rsinfo.isDone != ExprSingleResult)
+			if (!first_time || rsinfo.isDone != ExprSingleResult || !returnsSet)
 ereport(ERROR,
 		(errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
 		 errmsg("table-function protocol for materialize mode was not followed")));
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 506689d8ac..a0fe0851c4 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5287,8 +5287,9 @@
   proargnames => '{slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,stats_reset}',
   prosrc => 'pg_stat_get_replication_slots' },
 { oid => '6118', descr => 'statistics: information about subscription',
-  proname => 'pg_stat_get_subscription', proisstrict => 'f', provolatile => 's',
-  proparallel => 'r', prorettype => 'record', proargtypes => 'oid',
+  proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
+  proretset => 't', provolatile => 's', proparallel => 'r',
+  prorettype => 'record', proargtypes => 'oid',
   proallargtypes => '{oid,oid,oid,int4,pg_lsn,timestamptz,timestamptz,pg_lsn,timestamptz}',
   proargmodes => '{i,o,o,o,o,o,o,o,o}',
   proargnames => '{subid,subid,relid,pid,received_lsn,last_msg_send_time,last_msg_receipt_time,latest_end_lsn,latest_end_time}',


Re: New IndexAM API controlling index vacuum strategies

2021-03-08 Thread Robert Haas
On Mon, Feb 1, 2021 at 10:17 PM Peter Geoghegan  wrote:
> * No need to change MaxHeapTuplesPerPage for now, since that only
> really makes sense in cases that heavily involve bottom-up deletion,
> where we care about the *concentration* of LP_DEAD line pointers in
> heap pages (and not just the absolute number in the entire table),
> which is qualitative, not quantitative (somewhat like bottom-up
> deletion).
>
> The change to MaxHeapTuplesPerPage that Masahiko has proposed does
> make sense -- there are good reasons to increase it. Of course there
> are also good reasons to not do so. I'm concerned that we won't have
> time to think through all the possible consequences.

Yes, I agree that it's good to postpone this to a future release, and
that thinking through the consequences is not so easy. One possible
consequence that I'm concerned about is sequential scan performance.
For an index scan, you just jump to the line pointer you want and then
go get the tuple, but a sequential scan has to loop over all the line
pointers on the page, and skipping a lot of dead ones can't be
completely free. A small increase in MaxHeapTuplesPerPage probably
wouldn't matter, but the proposed increase of almost 10x (291 -> 2042)
is a bit scary. It's also a little hard to believe that letting almost
50% of the total space on the page get chewed up by the line pointer
array is going to be optimal. If that happens to every page while the
amount of data stays the same, the table must almost double in size.
That's got to be bad. The whole thing would be more appealing if there
were some way to exert exponentially increasing back-pressure on the
length of the line pointer array - that is, make it so that the longer
the array is already, the less willing we are to extend it further.
But I don't really see how to do that.

Also, at the risk of going on and on, line pointer array bloat is very
hard to eliminate once it happens. We never even try to shrink the
line pointer array, and if the last TID in the array is still in use,
it wouldn't be possible anyway, assuming the table has at least one
non-BRIN index. Index page splits are likewise irreversible, but
creating a new index and dropping the old one is still less awful than
having to rewrite the table.

Another thing to consider is that MaxHeapTuplesPerPage is used to size
some stack-allocated arrays, especially the stack-allocated
PruneState. I thought for a while about this and I can't really see
why it would be a big problem, even with a large increase in
MaxHeapTuplesPerPage, so I'm just mentioning this in case it makes
somebody else think of something I've missed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: SQL/JSON: functions

2021-03-08 Thread Ibrar Ahmed
On Sat, Jan 23, 2021 at 3:37 PM Erik Rijkers  wrote:

> On 2021-01-20 03:49, Nikita Glukhov wrote:
>
> > [0001-Add-common-SQL-JSON-clauses-v52.patch.gz]
> > [0002-SQL-JSON-constructors-v52.patch.gz]
> > [0003-IS-JSON-predicate-v52.patch.gz]
> > [0004-SQL-JSON-query-functions-v52.patch.gz]
> > [0005-SQL-JSON-functions-for-json-type-v52.patch.gz]
> > [0006-GUC-sql_json-v52.patch.gz]
>
> Hi,
>
> I read through the file func.sgml (only that file) and put the
> errors/peculiarities in the attached diff.  (Small stuff; typos really)
>
>
> Your patch includes a CREATE TABLE my_films + INSERT, to run the
> examples against.  I think this is a great idea and we should do it more
> often.
>
> But, the table has a text-column to contain the subsequently inserted
> json values. The insert runs fine but it turns out that some later
> examples queries only run against a jsonb column.  So I propose to
> change:
>CREATE TABLE my_films (js text);
> to:
>CREATE TABLE my_films (js jsonb);
>
> This change is not yet included in the attached file.  An alternative
> would be to cast the text-column in the example queries as js::jsonb
>
>
> I also noticed that some errors were different in the sgml file than 'in
> the event':
>
>
> SELECT JSON_QUERY(js, '$.favorites[*].kind' ERROR ON ERROR) FROM
> my_films_jsonb;
> (table 'my_films_jsonb' is the same as your 'my_films', but with js
> as a jsonb column)
>
> manual says: "ERROR: more than one SQL/JSON item"
>   in reality: "ERROR: JSON path expression in JSON_QUERY should return
> singleton item without wrapper"
>  and:   "HINT: use WITH WRAPPER clause to wrap SQL/JSON item
> sequence into array"
>
>
> Thanks,
>
> Erik Rijkers
>
> >
> > --
> > Nikita Glukhov
> > Postgres Professional: http://www.postgrespro.com
> > The Russian Postgres Company
>

The patch (func.sgml.20210123.diff) does not apply successfully.

http://cfbot.cputube.org/patch_32_2901.log



=== Applying patches on top of PostgreSQL commit ID
0ce4cd04da558178b0186057b721c50a00b7a945 ===
=== applying patch ./func.sgml.20210123.diff
patching file doc/src/sgml/func.sgml
Hunk #1 FAILED at 16968.
Hunk #2 FAILED at 17034.
...

Hunk #19 FAILED at 18743.
19 out of 19 hunks FAILED -- saving rejects to file doc/src/sgml/func.sgml.rej



Can we get a rebase?

I am marking the patch "Waiting on Author".

-- 
Ibrar Ahmed


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Joel Jacobson
On Mon, Mar 8, 2021, at 18:30, Tom Lane wrote:
> FWIW, I personally think that returning a start position and a length
> would be the most understandable way to operate. 

Very good point. I agree. (And then ranges cannot be used, regardless of 
canonical form.)

> Yeah: it's hard.  The amount of catalog infrastructure needed by a
> composite type is dauntingly large, and genbki.pl doesn't offer any
> support for building composite types that aren't tied to catalogs.
> (I suppose if you don't mind hacking Perl, you could try to refactor
> it to improve that.) 

I haven't studied genbki.pl in detail, but seen its name on the list many times,
maybe I should go through it to understand it in detail.

On the topic of Perl.
I've written a lot of Perl code over the years.
Trustly was initially a Perl+PostgreSQL microservice project, with different
components written in Perl running as daemons, communicating with each other
over TCP/IP via JSON-RPC. We had lots of strange problems that were difficult
to debug. In the end, we moved all the business logic from Perl into database
functions in PostgreSQL, and all the problems went away. The biggest win was
the UTF-8 handling, which was really awkward in Perl: it's kind of UTF-8, but
not really and not always.

Most programming languages/compilers are obsessed
with the concept of "bootstrapping"/"dogfooding".

Thinking of PostgreSQL as a language/compiler, that would mean we should be 
obsessed with the idea
of implementing PostgreSQL in SQL or PL/pgSQL. That would be quite a challenge 
of course.

However, for certain tasks, when a high-level language is preferred,
and when the raw performance of C isn't necessary, then maybe SQL/PLpgSQL
could be a serious alternative to Perl? 

If I understand it correctly, we don't need to run genbki.pl to compile 
PostgreSQL,
so someone wanting to compile PostgreSQL without having a running 
PostgreSQL-instance
could do so without problems.

A dependency on having a PostgreSQL instance running,
is perhaps acceptable for hackers developing PostgreSQL?
But of course not for normal users just wanting to compile PostgreSQL.

If we think there is at least a 1% chance this is a feasible idea,
I'm willing to try implementing a SQL/PLpgSQL-version of genbki.pl.
Would be a fun hack, but not if it's guaranteed time-waste.

>  It seems like you need it to return setof array(s), so the choices are
> array of composite, 2-D array, or two parallel arrays.  I'm not sure
> the first of those is so much better than the others that it's worth
> the pain involved to set up the initial catalog data that way.

I agree, I like the 2-D array version, but only if we could provide a 
C-function
to allow unnesting N+1 dims to N dims. Is that a fruitful idea, or are there
reasons why it cannot be done easily? I could give it a try, if we think it's a 
good idea.

> 
> BTW, I don't know if you know the history here, but regexp_matches()
> is way older than regexp_match(); we eventually invented the latter
> because the former was just too hard to use for easy non-'g' cases.
> I'm inclined to think we should learn from that and provide equivalent
> variants regexp_position[s] right off the bat.

I remember! regexp_match() was a very welcomed addition.
I agree both regexp_position[s] variants would be good for same reasons.

/Joel

Re: partial heap only tuples

2021-03-08 Thread Bossart, Nathan
On 3/8/21, 10:16 AM, "Ibrar Ahmed"  wrote:
> On Wed, Feb 24, 2021 at 3:22 AM Bossart, Nathan  wrote:
>> On 2/10/21, 2:43 PM, "Bruce Momjian"  wrote:
>>> I wonder if you should create a Postgres wiki page to document all of
>>> this.  I agree PG 15 makes sense.  I would like to help with this if I
>>> can.  I will need to study this email more later.
>>
>> I've started the wiki page for this:
>>
>>https://wiki.postgresql.org/wiki/Partial_Heap_Only_Tuples
>>
>> Nathan
>
> The regression test case  (partial-index) is failing 
>
> https://cirrus-ci.com/task/5310522716323840 

This patch is intended as a proof-of-concept of some basic pieces of
the project.  I'm working on a new patch set that should be more
suitable for community review.

Nathan



Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Mark Dilger



> On Mar 8, 2021, at 9:20 AM, Joel Jacobson  wrote:
> 
> On Mon, Mar 8, 2021, at 18:11, Mark Dilger wrote:
>> > On Mar 8, 2021, at 9:05 AM, Joel Jacobson  wrote:
>> > 
>> > If a N+1 dimension array could easily be unnested to a N dimension array,
>> > I would prefer Tom's idea of a 2-D regexp_positions(), since it simple and 
>> > not controversial.
>> 
>> How about proposing some array functions to go along with the 
>> regexp_positions, and then do it that way?
> 
> Sounds like a nice solution. That would be a huge win when dealing with 
> multidimensional arrays in general.
> 
> Do we have strong support on the list for such a function? If so, I can make 
> an attempt implementing it, unless some more experienced hacker wants to do 
> it.

That's a hard question to answer in advance.  Typically, you need to propose a 
solution, and then get feedback.  You wouldn't need to post a patch, but 
perhaps some examples of how you would expect it to work, like

+SELECT slice('{{1,2,3,4},{5,6,7,8},{9,10,11,12},{13,14,15,16}}'::integer[][], 
'[2,4)'::int4range);
+   slice   
+---
+ {{2,3}}
+ {{6,7}}
+ {{10,11}}
+ {{14,15}}
+(4 rows)
+
+SELECT 
slice('{{{1,2,3},{4,5,6},{7,8,9}},{{10,11,12},{13,14,15},{16,17,18}}}'::integer[][][],
 '[2,4)'::int4range);
+   slice   
+---
+ {{{4,5,6},{7,8,9}}}
+ {{{13,14,15},{16,17,18}}}
+(2 rows)
+

and then people can tell you why they hate that choice of interface.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Chapman Flack
On 03/08/21 13:29, Chapman Flack wrote:
> I think the s-free version is exactly the regexp_instr included in
> the other concurrent proposal [1]

sorry.

[1]
https://www.postgresql.org/message-id/fc160ee0-c843-b024-29bb-97b5da61971f%40darold.net




Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Chapman Flack
On 03/08/21 12:30, Tom Lane wrote:
> I'm inclined to think we should learn from that and provide equivalent
> variants regexp_position[s] right off the bat.

I think the s-free version is exactly the regexp_instr included in
the other concurrent proposal [1], which closely corresponds to the
ISO position_regex() except for the ISO one using XQuery regex syntax.

I gather from [1] that the name regexp_instr is chosen in solidarity
with other DBMSs that de facto have it. Would it be weirder to have the
singular form be regexp_instr and the plural be regexp_positions?
Or to diverge from the other systems' de facto convention and name
the singular form regexp_position? (Or the plural form regexp_instrs?
That sounds to me like a disassembler for regexps. Or regexps_instr,
like attorneys general? Never mind.)

Regards,
-Chap




Re: partial heap only tuples

2021-03-08 Thread Ibrar Ahmed
On Wed, Feb 24, 2021 at 3:22 AM Bossart, Nathan  wrote:

> On 2/10/21, 2:43 PM, "Bruce Momjian"  wrote:
> > I wonder if you should create a Postgres wiki page to document all of
> > this.  I agree PG 15 makes sense.  I would like to help with this if I
> > can.  I will need to study this email more later.
>
> I've started the wiki page for this:
>
> https://wiki.postgresql.org/wiki/Partial_Heap_Only_Tuples
>
> Nathan
>
>
The regression test case  (partial-index) is failing

https://cirrus-ci.com/task/5310522716323840


=== ./src/test/isolation/output_iso/regression.diffs ===
diff -U3 /tmp/cirrus-ci-build/src/test/isolation/expected/partial-index.out
/tmp/cirrus-ci-build/src/test/isolation/output_iso/results/partial-index.out
--- /tmp/cirrus-ci-build/src/test/isolation/expected/partial-index.out
2021-03-06 23:11:08.018868833 +
+++
/tmp/cirrus-ci-build/src/test/isolation/output_iso/results/partial-index.out
2021-03-06 23:26:15.857027075 +
@@ -30,6 +30,8 @@
6 a 1
7 a 1
8 a 1
+9 a 2
+10 a 2
step c2: COMMIT;
starting permutation: rxy1 wx1 wy2 c1 rxy2 c2
@@ -83,6 +85,7 @@
6 a 1
7 a 1
8 a 1
+9 a 2
10 a 1
step c1: COMMIT;


Can you please take a look at that?

-- 
Ibrar Ahmed


Re: cryptohash: missing locking functions for OpenSSL <= 1.0.2?

2021-03-08 Thread Jacob Champion
On Wed, 2021-03-03 at 15:30 +0900, Michael Paquier wrote:
> Extra eyes are welcome here, though I feel comfortable with the
> approach taken here.

I have one suggestion for the new logic:

>else
>{
>/*
> * In the non-SSL case, just remove the crypto callbacks.  
> This code
> * path has no dependency on any pending SSL calls.
> */
>destroy_needed = true;
>}
>[...]
>if (destroy_needed && conn->crypto_loaded)
>{
>destroy_ssl_system();
>conn->crypto_loaded = false;
>}

I had to convince myself that this logic is correct -- we set
destroy_needed even if crypto is not enabled, but then check later to
make sure that crypto_loaded is true before doing anything. What would
you think about moving the conn->crypto_loaded check to the else
branch, so that destroy_needed is only set if we actually need it?

Either way, the patch looks good to me and behaves nicely in testing.
Thanks!

--Jacob


Re: Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Tom Lane
"Joel Jacobson"  writes:
> I prefer to think of a match as two points. If the points are at the same 
> position, it's a zero length match.

FWIW, I personally think that returning a start position and a length
would be the most understandable way to operate.  If you report start
position and end position then there is always going to be confusion
over whether the end position is inclusive or exclusive (that is,
some code including our regex library thinks of the "end" as being
"first character after the match").  This is indeed the same
definitional issue you're contending with vis-a-vis range endpoints,
only now you lack any pre-existing definition that people might rely on
to know what you meant.

> Since there are currently zero composite type returning catalog functions, I 
> can see why the idea of returning a "range" with two "start" and "stop" 
> fields is controversial. There are probably good reasons that I fail to see 
> why there are no composite type returning functions in the catalogs. Ideas on 
> why this is the case, anyone?

Yeah: it's hard.  The amount of catalog infrastructure needed by a
composite type is dauntingly large, and genbki.pl doesn't offer any
support for building composite types that aren't tied to catalogs.
(I suppose if you don't mind hacking Perl, you could try to refactor
it to improve that.)  Up to now we've avoided the need for that,
since a function can be declared to return an anonymous record type
by giving it some OUT parameters.  However, if I'm understanding
things correctly "regexp_positions(IN ..., OUT match_start integer,
OUT match_length integer) RETURNS SETOF record" wouldn't be enough
for you, because you really need a 2-D tableau of match data to
handle the case of multiple capturing parens plus 'g' mode.  It
seems like you need it to return setof array(s), so the choices are
array of composite, 2-D array, or two parallel arrays.  I'm not sure
the first of those is so much better than the others that it's worth
the pain involved to set up the initial catalog data that way.

BTW, I don't know if you know the history here, but regexp_matches()
is way older than regexp_match(); we eventually invented the latter
because the former was just too hard to use for easy non-'g' cases.
I'm inclined to think we should learn from that and provide equivalent
variants regexp_position[s] right off the bat.

regards, tom lane




Re: [HACKERS] Custom compression methods

2021-03-08 Thread Justin Pryzby
On Sun, Mar 07, 2021 at 06:04:41PM +0530, Dilip Kumar wrote:
> On Sun, Mar 7, 2021 at 2:19 PM Justin Pryzby  wrote:
> >
> > Earlier in this thread, I suggested to implement an option to pg_restore to
> > avoid outputting compression, in order to allow restoring with a different
> > compression (by using the default_toast_compression GUC).  Now, it seems 
> > like
> > that's even more important, to allow restoring into binaries --without-lz4.
> > (the pg_dump isn't in LZ4 format, it just needs to not say "COMPRESSION 
> > LZ4").
> 
> IMHO, we have an option with pg_dump that should be sufficient, no?

I think it's insufficient since people may be unable to restore from backup, or
can only restore backup by resorting to this:
pg_restore -f- |sed 's/COMPRESSION lz4//' |psql -d mydb

I think there's a parallel with --no-tablespaces.  But if a tablespace is
missing/renamed, the table is still restored (with errors during SET
default_tablespace), whereas if lz4 is missing, the table is not restored.
Or actually the table would be created, but the COPY/INSERTs would fail.

There's an argument to be made that this is already an issue - for example,
I've numerous times done a partial restore of a single partition, where the
column types have changed in the parent, and I need to use sed to restore the
partition.  However, that's improving - in v14: "attach table" is a separate
pg_dump object, so the table *is* restored, and only the ATTACH command fails.
(See 9a4c0e36f).

I wonder if COMPRESSION should be dumped as ALTER statements, not in the
CREATE.  In fact, the CREATE syntax is optional and could be removed.  Similar
to ALTER TABLE t ALTER c SET STATISTICS 99 - there's no CREATE grammar for
that.

Note that I think that using ALTER doesn't resolve this issue, since the
createStmt is sent using the simple query protocol (PQexec), which means that
all its commands are executed as a single transaction, and if the ALTER to LZ4
fails, so does the preceding CREATE.  This is the same issue I see with
"CREATE..ATTACH PARTITION", above.

> but I agree that having such an option with restore will give more
> flexibility basically, by using the same dump we can restore to binary
> --with-lz4 as well as without-lz4 if such option exists with restore
> as well.  But it seems in pg_restore we process token by token so if
> we want to implement such an option then I think we will have to parse
> the complete string of CREATE TABLE command and remove the compression
> option if it exists for any attribute. I am not sure whether providing
> this option is worth the complexity?

Oh...I realize now that this is different from the "tablespace" case in that
the column compression is not stored separately in the dump.  And because it
exists for each column, not for the whole table.  I suppose one answer to that
would be to make compression a per-table reloption, rather than a per-attribute
option.  (I can anticipate that you'll hate this idea.)

-- 
Justin




Re: TRUNCATE on foreign table

2021-03-08 Thread Ibrar Ahmed
On Thu, Feb 11, 2021 at 6:23 PM Ashutosh Bapat 
wrote:

> On Wed, Feb 10, 2021 at 10:58 PM Kazutaka Onishi 
> wrote:
> >
> > That's because using the foreign server is difficult for the user.
> >
> > For example, the user doesn't always have the permission to login to the
> foreign server.
> > In some cases, the foreign table has been created by the administrator
> that has permission to access the two servers and the user only uses the
> local server.
> > Then the user has to ask the administrator to run TRUNCATE every time.
>
> That might actually be seen as a loophole but ...
>
> >
> > Furthermore,there are some fdw extensions which don't support SQL.
> mongo_fdw, redis_fdw, etc...
> > These extensions have been used to provide SQL interfaces to the users.
> > It's hard for the user to run TRUNCATE after learning each database.
>
> this has some appeal.
>
> Thanks for sharing the usecases.
> --
> Best Wishes,
> Ashutosh Bapat
>
>
> The patch (pgsql14-truncate-on-foreign-table.v2.patch) does not apply
successfully.

http://cfbot.cputube.org/patch_32_2972.log

patching file contrib/postgres_fdw/expected/postgres_fdw.out
Hunk #2 FAILED at 9179.
1 out of 2 hunks FAILED -- saving rejects to file
contrib/postgres_fdw/expected/postgres_fdw.out.rej


As this is a minor change, I have updated the patch. Please take a
look.

--
Ibrar Ahmed


pgsql14-truncate-on-foreign-table.v3.patch
Description: Binary data


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Joel Jacobson
On Mon, Mar 8, 2021, at 18:11, Mark Dilger wrote:
> > On Mar 8, 2021, at 9:05 AM, Joel Jacobson  wrote:
> > 
> > If a N+1 dimension array could easily be unnested to a N dimension array,
> > I would prefer Tom's idea of a 2-D regexp_positions(), since it simple and 
> > not controversial.
> 
> How about proposing some array functions to go along with the 
> regexp_positions, and then do it that way?

Sounds like a nice solution. That would be a huge win when dealing with 
multidimensional arrays in general.

Do we have strong support on the list for such a function? If so, I can make an 
attempt implementing it, unless some more experienced hacker wants to do it.

/Joel


Re: pg_upgrade failing for 200+ million Large Objects

2021-03-08 Thread Magnus Hagander
On Mon, Mar 8, 2021 at 5:58 PM Tom Lane  wrote:
>
> Magnus Hagander  writes:
> > On Mon, Mar 8, 2021 at 5:33 PM Tom Lane  wrote:
> >> It does seem that --single-transaction is a better idea than fiddling with
> >> the transaction wraparound parameters, since the latter is just going to
> >> put off the onset of trouble.  However, we'd have to do something about
> >> the lock consumption.  Would it be sane to have the backend not bother to
> >> take any locks in binary-upgrade mode?
>
> > I believe the problem occurs when writing them rather than when
> > reading them, and I don't think we have a binary upgrade mode there.
>
> You're confusing pg_dump's --binary-upgrade switch (indeed applied on
> the dumping side) with the backend's -b switch (IsBinaryUpgrade,
> applied on the restoring side).

Ah. Yes, I am.


> > We could invent one of course. Another option might be to exclusively
> > lock pg_largeobject, and just say that if you do that, we don't have
> > to lock the individual objects (ever)?
>
> What was in the back of my mind is that we've sometimes seen complaints
> about too many locks needed to dump or restore a database with $MANY
> tables; so the large-object case seems like just a special case.

It is -- but I guess it's more likely to have 100M large objects than
to have 100M tables (and the cutoff point comes a lot earlier than
100M). But the fundamental one is the same.


> The answer up to now has been "raise max_locks_per_transaction enough
> so you don't see the failure".  Having now consumed a little more
> caffeine, I remember that that works in pg_upgrade scenarios too,
> since the user can fiddle with the target cluster's postgresql.conf
> before starting pg_upgrade.
>
> So it seems like the path of least resistance is
>
> (a) make pg_upgrade use --single-transaction when calling pg_restore
>
> (b) document (better) how to get around too-many-locks failures.

Agreed. Certainly seems like a better path forward than arbitrarily
pushing the limit on number of transactions which just postpones the
problem.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: [PATCH] pg_permissions

2021-03-08 Thread Joel Jacobson
On Mon, Mar 8, 2021, at 15:35, Joe Conway wrote:
> While this is interesting and probably useful for troubleshooting, it does not
> provide the complete picture if what you care about is something like "what
> stuff can joel do in my database".

Good point, I agree.

I think that's a different more complicated use-case though.

Personally, I use these views to resolve errors like this:

$ dropuser test
dropuser: error: removal of role "test" failed: ERROR:  role "test" cannot be 
dropped because some objects depend on it
DETAIL:  1 object in database joel

Hmmm. I wonder which 1 object that could be?

$ psql
# SELECT * FROM pg_ownerships WHERE rolname = 'test';
 regclass | obj_desc | rolname
----------+----------+---------
 pg_class | table t  | test
 pg_type  | type t   | test
 pg_type  | type t[] | test
(3 rows)

It could also be due to permissions, so normally I would check both
pg_ownerships and pg_permissions at the same time,
since otherwise I could possibly get the same error again:

$ dropuser test
dropuser: error: removal of role "test" failed: ERROR:  role "test" cannot be 
dropped because some objects depend on it
DETAIL:  1 object in database joel

# SELECT * FROM pg_permissions WHERE grantee = 'test';
 regclass | obj_desc | grantor | grantee | privilege_type | is_grantable
----------+----------+---------+---------+----------------+--------------
 pg_class | table t  | joel    | test    | INSERT         | f
(1 row)

Now, this situation is probably easiest to resolve quickly using REASSIGN OWNED
BY ... TO ...,
but I think that command is scary; I would rather resolve it manually
so as not to blindly cause problems.
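
For the example above, the manual route would be something like (a sketch; note
that the pg_type rows for t and t[] follow the table's owner automatically):

ALTER TABLE t OWNER TO joel;
REVOKE INSERT ON t FROM test;
DROP ROLE test;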

/Joel

Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Mark Dilger



> On Mar 8, 2021, at 9:05 AM, Joel Jacobson  wrote:
> 
> If a N+1 dimension array could easily be unnested to a N dimension array,
> I would prefer Tom's idea of a 2-D regexp_positions(), since it simple and 
> not controversial.

How about proposing some array functions to go along with the regexp_positions, 
and then do it that way?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Yet another fast GiST build

2021-03-08 Thread Andrey Borodin
Thanks, Ibrar!

> 8 марта 2021 г., в 21:15, Ibrar Ahmed  написал(а):
> 
> 
> 
> On Mon, Mar 8, 2021 at 8:59 PM Peter Geoghegan  wrote:
> On Mon, Mar 8, 2021 at 6:41 AM Ibrar Ahmed  wrote:
> > The patch (0001-Add-bool-column-for-LP_DEAF-flag-to-GiST-pageinspect.patch)
> > does not apply successfully and has multiple hanks failed.
> 
> That's because it was committed.
> 
> Thanks for the clarification, its status was not changed which confused me :)
> 

There were numerous GiST-build-related patches in this thread. Still
uncommitted is a patch with sortsupport routines for the btree_gist contrib
module.
Here's its version which needs review.

Thanks for bringing this up!

Best regards, Andrey Borodin.



v5-0001-Sortsupport-for-sorting-GiST-build-for-gist_btree.patch
Description: Binary data


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Joel Jacobson
On Mon, Mar 8, 2021, at 17:20, Mark Dilger wrote:
> > On Mar 5, 2021, at 11:46 AM, Joel Jacobson  wrote:
> > <0003-regexp-positions.patch>
> 
> I did a bit more testing:
> 
> +SELECT regexp_positions('foobarbequebaz', 'b', 'g');
> + regexp_positions 
> +--
> + {"[3,5)"}
> + {"[6,8)"}
> + {"[11,13)"}
> +(3 rows)
> +
> 
> I understand that these ranges are intended to be read as one character long 
> matches starting at positions 3, 6, and 11, but they look like they match 
> either two or three characters, depending on how you read them, and users 
> will likely be confused by that.
> 
> +SELECT regexp_positions('foobarbequebaz', '(?=beque)', 'g');
> + regexp_positions 
> +--
> + {"[6,7)"}
> +(1 row)
> +
> 
> This is a zero length match.  As above, it might be confusing that a zero 
> length match reads this way.
> 
> +SELECT regexp_positions('foobarbequebaz', '(?<=z)', 'g');
> + regexp_positions 
> +--
> + {"[14,15)"}
> +(1 row)
> +
> 
> Same here, except this time position 15 is referenced, which is beyond the 
> end of the string.
> 
> I think a zero length match at the end of this string should read as 
> {"[14,14)"}, and you have been forced to add one to avoid that collapsing 
> down to "empty", but I'd rather you found a different datatype rather than 
> abuse the definition of int4range.
> 
> It seems that you may have reached a similar conclusion down-thread?

This is due to the, in my opinion, unfortunate decision of using 
inclusive/exclusive as the canonical form for discrete types.
Probably not much we can do about that, but that's what we have, so I think 
it's fine.

[6,7) is exactly the same thing as [6,6] for discrete types, it simply means 
the startpos and endpos both are 6.

I prefer to think of a match as two points. If the points are at the same 
position, it's a zero length match.

In the example, the startpos and endpos are both at 6, so it's a zero length 
match.,

This was very confusing to me at first. I wrongly thought I needed an empty
int4range and had the perverse idea of hacking the range type to allow setting
lower and upper even though it was empty. This was a really bad idea which I
feel stupid for even considering. It was before I understood that a zero
length match should actually *not* be represented as an empty int4range, but as
an int4range covering exactly one single integer, since startpos=endpos. This
was due to my off-by-one error. With that fixed, the only problem is the (in my
opinion) unnatural canonical form for discrete types, since in this context
it's just silly to talk about inclusive/exclusive. I think inclusive/inclusive
would have been much more SQL-idiomatic, since that's the semantics of BETWEEN
in SQL, and of most other programming environments I've seen.

However, not much we can do about that for int4range/int8range,
but maybe multirange could change the canonical form going forward.

Even if not changed, I think int4range works just fine. It just requires a bit
more mental effort to understand what the values mean. Probably an annoyance
for users at first, but I think they will easily understand they should just do
"-1" on the upper() value (but only if upper_inc() is FALSE -- though for
int4ranges you know that for sure, so one might wonder whether the check is
really necessary).
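
To make that concrete (just illustrating the canonical form discussed above):

SELECT int4range(6, 6, '[]');   -- displayed as [6,7), i.e. startpos = endpos = 6
SELECT lower('[6,7)'::int4range) AS startpos,        -- 6
       upper('[6,7)'::int4range) - 1 AS endpos,      -- 6
       upper_inc('[6,7)'::int4range) AS upper_inc;   -- false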

If a N+1 dimension array could easily be unnested to a N dimension array,
I would prefer Tom's idea of a 2-D regexp_positions(), since it simple and not 
controversial.

Since there are currently zero composite type returning catalog functions, I 
can see why the idea of returning a "range" with two "start" and "stop" fields 
is controversial. There are probably good reasons that I fail to see why there 
are no composite type returning functions in the catalogs. Ideas on why this is 
the case, anyone?

/Joel


Re: pg_stat_statements oddity with track = all

2021-03-08 Thread Magnus Hagander
On Sun, Mar 7, 2021 at 8:39 AM Julien Rouhaud  wrote:
>
> On Sat, Mar 06, 2021 at 06:56:49PM +0100, Magnus Hagander wrote:
> > On Sun, Dec 27, 2020 at 9:39 AM Julien Rouhaud  wrote:
> > >
> > - *
> > - * Right now, this structure contains no padding.  If you add any, make 
> > sure
> > - * to teach pgss_store() to zero the padding bytes.  Otherwise, things will
> > - * break, because pgss_hash is created using HASH_BLOBS, and thus tag_hash
> > - * is used to hash this.
> >
> > I don't think removing this comment completely is a good idea. Right
> > now it ends up there, yes, but eventually it might reach the same
> > state again. I think it's better to reword it based on the current
> > situation while keeping the part about it having to be zeroed for
> > padding. And maybe along with a comment at the actual memset() sites
> > as well?
>
> Agreed, I'll take care of that.
>
> > AFAICT, it's going to store two copies of the query in the query text
> > file (pgss_query_texts.stat)?  Can we find a way around that? Maybe by
> > looking up the entry with the flag set to the other value, and then
> > reusing that?
>
> Yes I was a bit worried about the possible extra text entry.  I kept things
> simple until now as the general opinion was that entries existing for both top
> level and nested level should be very rare so adding extra code and overhead 
> to
> spare a few query texts wouldn't be worth it.
>
> I think that using a flag might be a bit expensive, as we would have to make
> sure that at least one of the possible two entries has it.  So if there are 2
> entries and the one with the flag is evicted, we would have to transfer the
> flag to the other one, and check the existence of the flag when allocating a 
> new
> entry.  And all of that has to be done holding an exclusive lock on 
> pgss->lock.

Yeah, we'd certainly want to minimize things. But what if they both
have the flag at that point? Then we wouldn't have to care about
eviction? But yes, for new allocations we'd have to look up whether the
query existed with the other value of the flag, and copy it over in
that case.

> Maybe having a new hash table (without the toplevel flag) for those query text
> might be better, or maybe pgss performance is already so terrible when you 
> have
> to regularly evict entries that it wouldn't make any real difference.
>
> Should I try to add some extra code to make sure that we only store the query
> text once, or should I document that there might be duplicate, but we expect
> that to be very rare?

If we expect it to be rare, I think it might be reasonable to just
document that. But do we really have a strong argument for it being
rare?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: pg_upgrade failing for 200+ million Large Objects

2021-03-08 Thread Tom Lane
Magnus Hagander  writes:
> On Mon, Mar 8, 2021 at 5:33 PM Tom Lane  wrote:
>> It does seem that --single-transaction is a better idea than fiddling with
>> the transaction wraparound parameters, since the latter is just going to
>> put off the onset of trouble.  However, we'd have to do something about
>> the lock consumption.  Would it be sane to have the backend not bother to
>> take any locks in binary-upgrade mode?

> I believe the problem occurs when writing them rather than when
> reading them, and I don't think we have a binary upgrade mode there.

You're confusing pg_dump's --binary-upgrade switch (indeed applied on
the dumping side) with the backend's -b switch (IsBinaryUpgrade,
applied on the restoring side).

> We could invent one of course. Another option might be to exclusively
> lock pg_largeobject, and just say that if you do that, we don't have
> to lock the individual objects (ever)?

What was in the back of my mind is that we've sometimes seen complaints
about too many locks needed to dump or restore a database with $MANY
tables; so the large-object case seems like just a special case.

The answer up to now has been "raise max_locks_per_transaction enough
so you don't see the failure".  Having now consumed a little more
caffeine, I remember that that works in pg_upgrade scenarios too,
since the user can fiddle with the target cluster's postgresql.conf
before starting pg_upgrade.

So it seems like the path of least resistance is

(a) make pg_upgrade use --single-transaction when calling pg_restore

(b) document (better) how to get around too-many-locks failures.
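
For (b), the rule of thumb is that the shared lock table has roughly
max_locks_per_transaction * (max_connections + max_prepared_transactions)
slots, so the target cluster's setting has to be large enough to cover
everything the single-transaction restore will lock.  A back-of-the-envelope
check on a running cluster could look like this (sketch only):

    -- approximate number of object locks the server can track at once
    SELECT current_setting('max_locks_per_transaction')::int
         * (current_setting('max_connections')::int
            + current_setting('max_prepared_transactions')::int) AS approx_lock_slots;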

regards, tom lane




Re: proposal: psql –help reflecting service or URI usage

2021-03-08 Thread Mark Dilger



> On Mar 8, 2021, at 8:40 AM, Paul Förster  wrote:
> 
> Hi Mark,
> 
>> On 08. Mar, 2021, at 16:39, Mark Dilger  wrote:
>> 
>> Fortunately, the man pages and html docs are generated from the same 
>> sources.  Those sources are written in sgml, and the tools to build the docs 
>> must be installed.  From the top directory, execute `make docs` and if it 
>> complains about missing tools you will need to install them.  (The build 
>> target is 'docs', but the directory containing the docs is named 'doc'.)
> 
> so the help files I'd change would be doc/src/sgml/ref/psql-ref.sgml, 
> doc/src/sgml/ref/pg_isready.sgml, etc.?

Yes

>> Oh, I'm quite sorry to hear that.  The process of getting a patch accepted, 
>> especially the first time you submit one, can be discouraging.  But the 
>> community greatly benefits from new contributors joining the effort, so I'd 
>> much rather you not withdraw the idea.
> 
> I'd like to, and also I'd like to do all the bin/* tools (including wrapping 
> the long line in pg_isready ;-)), as you suggested, but I don't know the 
> process. In my first admittedly naive attempt, I just downloaded the source 
> from https://www.postgresql.org/ftp/source/v13.2, unpacked it and made my 
> changes there. Then I did a diff to the original and posted it here. I don't 
> even know if this is the correct workflow. I saw gitgub being mentioned a 
> couple of times but I don't have an account, nor do I even know how it works.
> 
> I was pretty surprised to see the lines in PWN:
> 
> "Paul Förster sent in a patch to mention database URIs in psql's --help 
> output."
> "Paul Förster sent in another revision of a patch to mention URIs and 
> services in psql --help's output."
> 
> Is there a FAQ somewhere that describes how properly create patches, submit 
> them and possibly get them released? Something like a step-by-step?
> 
> Is github a must-have here?

No, github is not a must-have.  I don't use a github account for my 
submissions.  The project uses git for source code control, but that's not the 
same thing as requiring "github".  The project switched from cvs to git a few 
years back.

If you can install git, using rpm or yum or whatever, then from the command 
line you can use

  git clone https://git.postgresql.org/git/postgresql.git

and it will create a working directory for you.  You can make modifications and 
commit them.  When you are finished, you can run

  git format-patch -v 1 master

and it will create a patch set containing all your changes relative to the 
public sources you cloned, and the patch will include your commit messages, 
which helps reviewers not only know what you changed, but why you made the 
changes, in your own words.

See https://wiki.postgresql.org/wiki/Development_information

>> If you need help with certain portions of the submission, such as editing 
>> the docs, I can help with that.
> 
> as you see above, I'm curious to learn, though doing it to all the tools will 
> take some time for me.
> 
> Sorry, I'm a noob, not so much to C, but to the workflows here. Hence my 
> questions may seem a little obvious to all the pros.

That's not a problem.  If this gets too verbose for the list, we can take this 
off-list and I can still walk you through it.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: shared-memory based stats collector

2021-03-08 Thread Ibrar Ahmed
On Fri, Mar 5, 2021 at 8:32 PM Fujii Masao 
wrote:

>
>
> On 2021/03/05 17:18, Kyotaro Horiguchi wrote:
> > At Thu, 21 Jan 2021 12:03:48 +0900 (JST), Kyotaro Horiguchi <
> horikyota@gmail.com> wrote in
> >> Commit 960869da08 (database statistics) conflicted with this. Rebased.
> >>
> >> I'm concerned about the behavior that pgstat_update_connstats calls
> >> GetCurrentTimestamp() every time stats update happens (with intervals
> >> of 10s-60s in this patch). But I didn't change that design since that
> >> happens with about 0.5s intervals in master and the rate is largely
> >> reduced in this patch, to make this patch simpler.
> >
> > I stepped on my foot, and another commit coflicted. Just rebased.
>
> Thanks for rebasing the patches!
>
> I think that 0003 patch is self-contained and useful, for example which
> enables us to monitor archiver process in pg_stat_activity. So IMO
> it's worth pusing 0003 patch firstly.
>
> Here are the review comments for 0003 patch.
>
> +   /* Archiver process's latch */
> +   Latch  *archiverLatch;
> +   /* Current shared estimate of appropriate spins_per_delay value */
>
> The last line in the above seems not necessary.
>
> In proc.h, NUM_AUXILIARY_PROCS needs to be incremented.
>
>  /* --
>   * Functions called from postmaster
>   * --
>   */
>  extern int pgarch_start(void);
>
> In pgarch.h, the above is not necessary.
>
> +extern void XLogArchiveWakeup(void);
>
> This seems no longer necessary.
>
> +extern void XLogArchiveWakeupStart(void);
> +extern void XLogArchiveWakeupEnd(void);
> +extern void XLogArchiveWakeup(void);
>
> These seem also no longer necessary.
>
> PgArchPID = 0;
> if (!EXIT_STATUS_0(exitstatus))
> -   LogChildExit(LOG, _("archiver process"),
> -pid, exitstatus);
> -   if (PgArchStartupAllowed())
> -   PgArchPID = pgarch_start();
> +   HandleChildCrash(pid, exitstatus,
> +
> _("archiver process"));
>
> I don't think that we should treat non-zero exit condition as a crash,
> as before. Otherwise when archive_command fails on a signal,
> archiver emits FATAL error and which leads the server restart.
>
> - * walwriter, autovacuum, or background worker.
> + * walwriter, autovacuum, archiver or background worker.
>*
>* The objectives here are to clean up our local state about the child
>* process, and to signal all other remaining children to quickdie.
> @@ -3609,6 +3606,18 @@ HandleChildCrash(int pid, int exitstatus, const
> char *procname)
> signal_child(AutoVacPID, (SendStop ? SIGSTOP : SIGQUIT));
> }
>
> +   /* Take care of the archiver too */
> +   if (pid == PgArchPID)
> +   PgArchPID = 0;
> +   else if (PgArchPID != 0 && take_action)
> +   {
> +   ereport(DEBUG2,
> +   (errmsg_internal("sending %s to process
> %d",
> +(SendStop
> ? "SIGSTOP" : "SIGQUIT"),
> +(int)
> PgArchPID)));
> +   signal_child(PgArchPID, (SendStop ? SIGSTOP : SIGQUIT));
> +   }
> +
>
> Same as above.
>
>
> In xlogarchive.c, "#include "storage/pmsignal.h"" is no longer necessary.
>
>
> pgarch_forkexec() should be removed from pgarch.c because it's no longer
> used.
>
>
>  /* 
>   * Public functions called from postmaster follow
>   * 
>   */
>
> The definition of PgArchiverMain() should be placed just
> after the above comment.
>
>
> exit(0) in PgArchiverMain() should be proc_exit(0)?
>
> Regards,
>
> --
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION
>
>
>
The code does not compile; it fails with the following warnings and errors:


--
pgstat.c:446:25: note: ‘cached_slrustats’ declared here
static PgStat_SLRUStats cached_slrustats;
^~~~
guc.c:4372:4: error: use of undeclared identifier 'pgstat_temp_directory';
did you mean 'pgstat_stat_directory'?
_temp_directory,
^
pgstat_stat_directory
../../../../src/include/pgstat.h:922:14: note: 'pgstat_stat_directory'
declared here
extern char *pgstat_stat_directory;
^
guc.c:4373:3: error: use of undeclared identifier 'PG_STAT_TMP_DIR'
PG_STAT_TMP_DIR,
^
guc.c:4374:25: error: use of undeclared identifier
'assign_pgstat_temp_directory'
check_canonical_path, assign_pgstat_temp_directory, NULL

---


Can we get an updated patch?

I am marking the patch "Waiting on Author"


-- 
Ibrar Ahmed


Re: [PATCH] New default role allowing to change per-role/database settings

2021-03-08 Thread Michael Banck
Hi,

On Monday, 08.03.2021, at 20:54 +0500, Ibrar Ahmed wrote:
> On Thu, Dec 31, 2020 at 6:16 PM Michael Banck  
> wrote:
> > in today's world, some DBAs have no superuser rights, but we can
> > delegate them additional powers like CREATEDB or the pg_monitor default
> > role etc. Usually, the DBA can also view the database logs, either via
> > shell access or some web interface.
> > 
> > One thing that I personally find lacking is that it is not possible to
> > change role-specific log settings (like log_statement = 'mod' for a
> > security sensitive role) without being SUPERUSER as their GUC context is
> > "superuser". This makes setup auditing much harder if there is no
> > SUPERUSER access, also pgaudit then only allows to configure object-
> > based auditing. Amazon RDS e.g. has patched Postgres to allow the
> > cluster owner/pseudo-superuser `rds_superuser' to change those log
> > settings that define what/when we log something, while keeping the
> > "where to log" entries locked down.
> > 
> > The attached patch introduces a new guc context "administrator" (better
> > names/bikeshedding for this welcome) that is meant as a middle ground
> > between "superuser" and "user". It also adds a new default role
> > "pg_change_role_settings" (better names also welcome) that can be
> > granted to DBAs so that they can change the "administrator"-context GUCs
> > on a per-role (or per-database) basis. Whether the latter should be
> > included is maybe debatable, but I added both on the basis that they are
> > the same "source".
> > 
> > The initial set of "administrator" GUCs are all current GUCs with
> > "superuser" context from these categories:
> > 
> > * Reporting and Logging / What to Log
> > * Reporting and Logging / When to
> > Log
> > * Statistics / Query and Index Statistics Collector
> > * Statistics /
> > Monitoring
> > 
> > Of course, further categories (or particular GUCs) could be added now or
> > in the future, e.g. RDS also patches the following GUCs in their v12
> > offering:
> > 
> > * temp_file_limit
> > * session_replication_role
> > 
> > RDS does *not* patch log_transaction_sample_rate from "Reporting and
> > Logging / When to Log", but that might be more of an oversight than a
> > security consideration, or does anybody see a big problem with that
> > (compared to the others in that set)?
> > 
> > I initially pondered not introducing a new context but just filtering on
> > category, but as categories seem to be only descriptive in guc.c and not
> > used for any policy decisions so far, I have abandoned this pretty
> > quickly.
> > 
> > 
> > Thoughts?
> > 
> > Michael
> 
> The patch (0001-Add-new-PGC_ADMINSET-guc-context-and-pg_change_role_.patch) 
> does
> not apply successfully and has some hunks failed.

Thanks for letting me know.

> http://cfbot.cputube.org/patch_32_2918.log
> 1 out of 23 hunks FAILED -- saving rejects to file 
> src/backend/utils/misc/guc.c.rej
> patching file src/include/catalog/catversion.h
> Hunk #1 FAILED at 53.
> 1 out of 1 hunk FAILED -- saving rejects to file 
> src/include/catalog/catversion.h.rej
> patching file src/include/catalog/pg_authid.dat
> Can we get a rebase? 

Please find attached the rebase; I've removed the catversion hunk as I
believe it is customary to leave that to committers.

> I am marking the patch "Waiting on Author".

I've put that back to "Needs Review".
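
For anyone picking this up for review, the intended usage with the patch
applied is roughly the following (the role and object names below are made
up; only pg_change_role_settings and the administrator-context GUCs come
from the patch):

    -- as superuser, delegate once:
    GRANT pg_change_role_settings TO dba_role;

    -- afterwards, dba_role can adjust those settings per role or database:
    ALTER ROLE sensitive_role SET log_statement = 'mod';
    ALTER DATABASE app_db SET log_min_duration_statement = '250ms';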


Cheers,

Michael

-- 
Michael Banck
Project Lead / Senior Consultant
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
VAT ID number: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Managing directors: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Our handling of personal data is subject to the
following provisions: https://www.credativ.de/datenschutz
From d0eb8091d294ed1092eebcce30a3d67a13bdc84f Mon Sep 17 00:00:00 2001
From: Michael Banck 
Date: Tue, 15 Dec 2020 20:53:59 +0100
Subject: [PATCH v2] Add new PGC_ADMINSET guc context and
 pg_change_role_settings default role.

This commit introduces `administrator' as a middle ground between `superuser'
and `user' for guc contexts.

Those settings initially include all previously `superuser'-context settings
from the 'Reporting and Logging' ('What/When to Log' but not 'Where to Log')
and both 'Statistics' categories. Further settings could be added in follow-up
commits.

Also, a new default role pg_change_role_settings is introduced.  This allows
administrators (that are not superusers) that have been granted this role to
change the above PGC_ADMINSET settings on a per-role/database basis like "ALTER
ROLE [...] SET log_statement".

The rationale is to allow certain settings that affect logging to be changed
in setups where the DBA has access to the database log, but is not a full
superuser.
---
 src/backend/commands/dbcommands.c |  7 +++-
 src/backend/commands/user.c   |  7 ++--
 src/backend/utils/misc/guc.c  | 56 +++
 

Re: ResourceOwner refactoring

2021-03-08 Thread Ibrar Ahmed
On Mon, Jan 25, 2021 at 10:15 PM Robert Haas  wrote:

> On Thu, Jan 21, 2021 at 5:14 AM Heikki Linnakangas 
> wrote:
> > Here you can see that as numsnaps increases, the test becomes slower,
> > but then it becomes faster again at 64-66, when it switches to the hash
> > table. So 64 seems too much. 32 seems to be the sweet spot here, that's
> > where scanning the hash and scanning the array are about the same speed.
>
> That sounds good. I mean, it could be that not all hardware behaves
> the same here. But trying to get it into the right ballpark makes
> sense.
>
> I also like the fact that this now has some cases where it wins by a
> significant margin. That's pretty cool; thanks for working on it!
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
>
>
>
The patchset does not apply successfully; there are some hunk failures.

http://cfbot.cputube.org/patch_32_2834.log

v6-0002-Make-resowners-more-easily-extensible.patch

1 out of 6 hunks FAILED -- saving rejects to file
src/backend/utils/cache/plancache.c.rej
2 out of 15 hunks FAILED -- saving rejects to file
src/backend/utils/resowner/resowner.c.rej


Can we get a rebase?

I am marking the patch "Waiting on Author"

-- 
Ibrar Ahmed


Re: proposal: psql –help reflecting service or URI usage

2021-03-08 Thread Paul Förster
Hi Mark,

> On 08. Mar, 2021, at 16:39, Mark Dilger  wrote:
> 
> Fortunately, the man pages and html docs are generated from the same sources. 
>  Those sources are written in sgml, and the tools to build the docs must be 
> installed.  From the top directory, execute `make docs` and if it complains 
> about missing tools you will need to install them.  (The build target is 
> 'docs', but the directory containing the docs is named 'doc'.)

so the help files I'd change would be doc/src/sgml/ref/psql-ref.sgml, 
doc/src/sgml/ref/pg_isready.sgml, etc.?

> Oh, I'm quite sorry to hear that.  The process of getting a patch accepted, 
> especially the first time you submit one, can be discouraging.  But the 
> community greatly benefits from new contributors joining the effort, so I'd 
> much rather you not withdraw the idea.

I'd like to, and also I'd like to do all the bin/* tools (including wrapping 
the long line in pg_isready ;-)), as you suggested, but I don't know the 
process. In my first admittedly naive attempt, I just downloaded the source 
from https://www.postgresql.org/ftp/source/v13.2, unpacked it and made my 
changes there. Then I did a diff to the original and posted it here. I don't 
even know if this is the correct workflow. I saw github being mentioned a 
couple of times but I don't have an account, nor do I even know how it works.

I was pretty surprised to see the lines in PWN:

"Paul Förster sent in a patch to mention database URIs in psql's --help output."
"Paul Förster sent in another revision of a patch to mention URIs and services 
in psql --help's output."

Is there a FAQ somewhere that describes how to properly create patches, submit 
them and possibly get them released? Something like a step-by-step?

Is github a must-have here?

> If you need help with certain portions of the submission, such as editing the 
> docs, I can help with that.

as you see above, I'm curious to learn, though doing it to all the tools will 
take some time for me.

Sorry, I'm a noob, not so much to C, but to the workflows here. Hence my 
questions may seem a little obvious to all the pros.

Cheers,
Paul



Re: pg_upgrade failing for 200+ million Large Objects

2021-03-08 Thread Magnus Hagander
On Mon, Mar 8, 2021 at 5:33 PM Tom Lane  wrote:
>
> Robins Tharakan  writes:
> > On Mon, 8 Mar 2021 at 23:34, Magnus Hagander  wrote:
> >> Without looking, I would guess it's the schema reload using
> >> pg_dump/pg_restore and not actually pg_upgrade itself. This is a known
> >> issue in pg_dump/pg_restore. And if that is the case -- perhaps just
> >> running all of those in a single transaction would be a better choice?
> >> One could argue it's still not a proper fix, because we'd still have a
> >> huge memory usage etc, but it would then only burn 1 xid instead of
> >> 500M...
>
> > (I hope I am not missing something but) When I tried to force pg_restore to
> > use a single transaction (by hacking pg_upgrade's pg_restore call to use
> > --single-transaction), it too failed owing to being unable to lock so many
> > objects in a single transaction.
>
> It does seem that --single-transaction is a better idea than fiddling with
> the transaction wraparound parameters, since the latter is just going to
> put off the onset of trouble.  However, we'd have to do something about
> the lock consumption.  Would it be sane to have the backend not bother to
> take any locks in binary-upgrade mode?

I believe the problem occurs when writing them rather than when
reading them, and I don't think we have a binary upgrade mode there.

We could invent one of course. Another option might be to exclusively
lock pg_largeobject, and just say that if you do that, we don't have
to lock the individual objects (ever)?
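
That is, the restore would essentially issue something like the following
once, instead of taking one lock per large object (sketch only):

    LOCK TABLE pg_catalog.pg_largeobject IN ACCESS EXCLUSIVE MODE;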

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Improvements and additions to COPY progress reporting

2021-03-08 Thread Matthias van de Meent
On Mon, 8 Mar 2021 at 09:24, Michael Paquier  wrote:
>
> On Sun, Mar 07, 2021 at 04:50:31PM +0530, Bharath Rupireddy wrote:
> > Attaching remaining patches 0001 and 0003 from the v11 patch
> > set(posted upthread) here to make cfbot happier.
>
> Looking at patch 0002, the location of each progress report looks good
> to me.  I have some issues with some of the names chosen though, so I
> would like to suggest a few changes to simplify things:
> - PROGRESS_COPY_IO_TYPE_* => PROGRESS_COPY_TYPE_*
> - PROGRESS_COPY_IO_TYPE => PROGRESS_COPY_TYPE
> - PROGRESS_COPY_TYPE_STDIO => PROGRESS_COPY_TYPE_PIPE
> - In pg_stat_progress_copy, io_type => type

Seems reasonable. PFA updated patches. I've renamed the previous 0003
to 0002 to keep git-format-patch easy.

> It seems a bit confusing to not count insertions on foreign tables
> where nothing happened.  I am fine to live with that, but can I ask if
> this has been thought about?

This keeps the current behaviour of the implementation as committed
in 8a4f618e; the rationale of that patch was that this number should
mirror the number returned by the COPY command.

I am not opposed to adding another column for `tuples_inserted` and
changing the logic accordingly (see prototype 0003), but that was not
in the intended scope of this patchset. Unless you think that this
should be included in this current patchset, I'll spin that patch out
into a different thread, but I'm not sure that would make it into
pg14.
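
To illustrate, with the attached 0003 applied, silently failed inserts
become derivable from the view, e.g. (sketch only; tuples_inserted exists
only with that patch):

    SELECT pid, relid::regclass AS relname,
           tuples_processed, tuples_excluded, tuples_inserted,
           tuples_processed - tuples_excluded - tuples_inserted AS tuples_failed
    FROM pg_stat_progress_copy;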

With regards,

Matthias van de Meent.
From 94876abe0ab9c28a6f4b0ac006f356251ca4746c Mon Sep 17 00:00:00 2001
From: Matthias van de Meent 
Date: Mon, 8 Mar 2021 16:08:24 +0100
Subject: [PATCH v12 3/3] Adapt COPY progress reporting to report processed and
 inserted tuples

Previously, tuples_processed was implied to be the amount of tuples inserted
in COPY ... FROM. This code is now changed to make tuples_processed count
all tuples that were identified in the COPY ... FROM command, and make
tuples_inserted count all tuples that were inserted into the table (that is,
it excludes the tuples excluded using the WHERE clause, and those that were
not inserted due to triggers, or failure to insert into foreign tables).

This also renumbers the columns to be back in-order, before the stamping of
pg14 makes those numbers effectively immutable
---
 doc/src/sgml/monitoring.sgml | 30 
 src/backend/catalog/system_views.sql | 13 ++-
 src/backend/commands/copyfrom.c  | 34 
 src/include/commands/progress.h  | 13 ++-
 src/test/regress/expected/rules.out  | 13 ++-
 src/test/regress/output/copy.source  |  4 ++--
 6 files changed, 83 insertions(+), 24 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 32bebc70db..aa2e15a748 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6595,6 +6595,24 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
   
   
Number of tuples already processed by COPY command.
+   For COPY FROM this includes all tuples from the
+   source material (including tuples excluded by the
+   WHERE clause of the COPY
+   command, and tuples that were not inserted as a result of trigger
+   behaviour). For COPY TO this includes all tuples
+   that exist in the table, or those that are returned by the
+   SELECT.
+  
+ 
+
+ 
+  
+   tuples_inserted bigint
+  
+  
+   Number of tuples inserted into the table with the
+   COPY FROM command. Is 0 for
+   COPY TO.
   
  
 
@@ -6610,6 +6628,18 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 

   
+  
+   
+The tuples_excluded column does not count the tuples
+that failed to insert, but only those tuples that were available in the
+source material and not eligible for insertion through exclusion by the
+WHERE clause. You can calculate the number of tuples
+that failed to insert into the table due to triggers (or that otherwise
+silently fail to insert) by subtracting both
+tuples_excluded and tuples_inserted
+from tuples_processed instead.
+   
+  
  
 
  
diff --git a/src/backend/catalog/system_views.sql 
b/src/backend/catalog/system_views.sql
index 8f9987f311..f59de36742 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1130,18 +1130,19 @@ CREATE VIEW pg_stat_progress_copy AS
 SELECT
 S.pid AS pid, S.datid AS datid, D.datname AS datname,
 S.relid AS relid,
-CASE S.param5 WHEN 1 THEN 'COPY FROM'
+CASE S.param1 WHEN 1 THEN 'COPY FROM'
   WHEN 2 THEN 'COPY TO'
   END AS command,
-CASE S.param6 WHEN 1 THEN 'FILE'
+CASE S.param2 WHEN 1 THEN 'FILE'
   WHEN 2 THEN 'PROGRAM'
   WHEN 3 THEN 'PIPE'
   WHEN 4 THEN 'CALLBACK'
 

Re: pg_upgrade failing for 200+ million Large Objects

2021-03-08 Thread Tom Lane
Robins Tharakan  writes:
> On Mon, 8 Mar 2021 at 23:34, Magnus Hagander  wrote:
>> Without looking, I would guess it's the schema reload using
>> pg_dump/pg_restore and not actually pg_upgrade itself. This is a known
>> issue in pg_dump/pg_restore. And if that is the case -- perhaps just
>> running all of those in a single transaction would be a better choice?
>> One could argue it's still not a proper fix, because we'd still have a
>> huge memory usage etc, but it would then only burn 1 xid instead of
>> 500M...

> (I hope I am not missing something but) When I tried to force pg_restore to
> use a single transaction (by hacking pg_upgrade's pg_restore call to use
> --single-transaction), it too failed owing to being unable to lock so many
> objects in a single transaction.

It does seem that --single-transaction is a better idea than fiddling with
the transaction wraparound parameters, since the latter is just going to
put off the onset of trouble.  However, we'd have to do something about
the lock consumption.  Would it be sane to have the backend not bother to
take any locks in binary-upgrade mode?

regards, tom lane




Re: pg_amcheck contrib application

2021-03-08 Thread Robert Haas
On Thu, Mar 4, 2021 at 5:39 PM Mark Dilger  wrote:
> I think Robert mistook why I was doing that.  I was thinking about a 
> different usage pattern.  If somebody thinks a subset of relations have been 
> badly corrupted, but doesn't know which relations those might be, they might 
> try to find them with pg_amcheck, but wanting to just check the first few 
> blocks per relation in order to sample the relations.  So,
>
>   pg_amcheck --startblock=0 --endblock=9 --no-dependent-indexes
>
> or something like that.  I don't think it's very fun to have it error out for 
> each relation that doesn't have at least ten blocks, nor is it fun to have 
> those relations skipped by error'ing out before checking any blocks, as they 
> might be the corrupt relations you are looking for.  But using --startblock 
> and --endblock for this is not a natural fit, as evidenced by how I was 
> trying to "fix things up" for the user, so I'll punt on this usage until some 
> future version, when I might add a sampling option.

I admit I hadn't thought of that use case. I guess somebody could want
to do that, but it doesn't seem all that useful. Checking the first
up-to-ten blocks of every relation is not a very representative
sample, and it's not clear to me that sampling is a good idea even if
it were representative. What good is it to know that 10% of my
database is probably not corrupted?

On the other hand, people want to do all kinds of things that seem
strange to me, and this might be another one. But, if that's so, then
I think the right place to implement it is in amcheck itself, not
pg_amcheck. I think pg_amcheck should be, now and in the future, a
thin wrapper around the functionality provided by amcheck, just
providing target selection and parallel execution. If you put
something into pg_amcheck that figures out how long the relation is
and runs it on some of the blocks, that functionality is only
accessible to people who are accessing amcheck via pg_amcheck. If you
put it in amcheck itself and just expose it through pg_amcheck, then
it's accessible either way. It's probably cleaner and more performant
to do it that way, too.

So if you did add a sampling option in the future, that's the way I
would recommend doing it, but I think it is probably best not to go
there right now.
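
For context, the SQL-level interface already accepts a block range, so a
future sampling option would just layer on top of something like this
(argument names as in contrib/amcheck's verify_heapam; sketch only):

    -- check only the first ten blocks of a single relation
    SELECT * FROM verify_heapam('some_table', startblock := 0, endblock := 9);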

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Let people set host(no)ssl settings from initdb

2021-03-08 Thread Ibrar Ahmed
On Thu, Mar 4, 2021 at 7:25 AM Michael Paquier  wrote:

> On Wed, Mar 03, 2021 at 03:07:30PM +0100, Peter Eisentraut wrote:
> > I think there is enough sustained opposition to this patch that we can
> mark
> > this as rejected in the commitfest.
>
> +1.
> --
> Michael
>

The patch (v5-0001-Enable-setting-pg_hba.conf-permissions-from-initd.patch)
does not apply successfully.
There are two reasons: first, it was not generated with the proper "-p" level,
which confused the cfbot; second, even after fixing that issue, the patch
still needs a rebase.


http://cfbot.cputube.org/patch_32_2916.log

|+++ doc/src/sgml/ref/initdb.sgml
--
No file to patch.  Skipping patch.
1 out of 1 hunk ignored
can't find file to patch at input line 77
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:


Can we get a rebase?

I am marking the patch "Waiting on Author"


-- 
Ibrar Ahmed


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-08 Thread Mark Dilger



> On Mar 5, 2021, at 11:46 AM, Joel Jacobson  wrote:
> 
> 
> /Joel
> <0003-regexp-positions.patch>

I did a bit more testing:

+SELECT regexp_positions('foobarbequebaz', 'b', 'g');
+ regexp_positions 
+--
+ {"[3,5)"}
+ {"[6,8)"}
+ {"[11,13)"}
+(3 rows)
+

I understand that these ranges are intended to be read as one character long 
matches starting at positions 3, 6, and 11, but they look like they match 
either two or three characters, depending on how you read them, and users will 
likely be confused by that.

+SELECT regexp_positions('foobarbequebaz', '(?=beque)', 'g');
+ regexp_positions 
+--
+ {"[6,7)"}
+(1 row)
+

This is a zero length match.  As above, it might be confusing that a zero 
length match reads this way.

+SELECT regexp_positions('foobarbequebaz', '(?<=z)', 'g');
+ regexp_positions 
+--
+ {"[14,15)"}
+(1 row)
+

Same here, except this time position 15 is referenced, which is beyond the end 
of the string.

I think a zero-length match at the end of this string should read as 
{"[14,14)"}, and you have been forced to add one to avoid that collapsing down 
to "empty", but I'd rather you found a different datatype than abuse the 
definition of int4range.
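
The collapsing behaviour is easy to see with stock int4range (just for
illustration):

    SELECT int4range(14, 14);        -- comes back as 'empty'
    SELECT int4range(3, 4, '[]');    -- canonicalized and displayed as [3,5)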

It seems that you may have reached a similar conclusion down-thread?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Yet another fast GiST build

2021-03-08 Thread Ibrar Ahmed
On Mon, Mar 8, 2021 at 8:59 PM Peter Geoghegan  wrote:

> On Mon, Mar 8, 2021 at 6:41 AM Ibrar Ahmed  wrote:
> > The patch
> > (0001-Add-bool-column-for-LP_DEAF-flag-to-GiST-pageinspect.patch)
> > does not apply successfully and has multiple hanks failed.
>
> That's because it was committed.

Thanks for the clarification; its status was not changed, which confused me :)



> --
> Peter Geoghegan
>


-- 
Ibrar Ahmed


Re: Evaluate expression at planning time for two more cases

2021-03-08 Thread Ibrar Ahmed
On Tue, Nov 24, 2020 at 12:47 PM Surafel Temesgen 
wrote:

> Hi Pavel Borisov,
> It's always good to select the optimal way even if it didn't have
> performance gain
> but in case of this patch i see 4x speed up on my laptop and it will work
> on any
> table that have NULL constraint
>
> regards
> Surafel
>

The patch (null_check_on_pkey_optimization_v2.patch) does not apply
successfully.
http://cfbot.cputube.org/patch_32_2699.log

1 out of 10 hunks FAILED -- saving rejects to file
src/backend/optimizer/util/clauses.c.rej


It was a minor change, so I rebased the patch; please take a look.

-- 
Ibrar Ahmed


null_check_on_pkey_optimization_v3.patch
Description: Binary data


Re: [PATCH] pg_ownerships system view

2021-03-08 Thread Andreas Karlsson

On 3/7/21 1:08 AM, Joel Jacobson wrote:
> Attached is a suggestion of adding a convenience view,
> allowing quickly looking up all objects owned by a given user.


This definitely seems like a useful feature. I know I am guilty of 
creating tables as the wrong role more than one time.


Andreas




Re: Yet another fast GiST build

2021-03-08 Thread Peter Geoghegan
On Mon, Mar 8, 2021 at 6:41 AM Ibrar Ahmed  wrote:
> The patch (0001-Add-bool-column-for-LP_DEAF-flag-to-GiST-pageinspect.patch)
> does not apply successfully and has multiple hanks failed.

That's because it was committed.

-- 
Peter Geoghegan



