Re: Clear empty space in a page.

2021-05-29 Thread Fabien COELHO



Hello Yura,


didn't measure impact on raw performance yet.


Must be done. There could/should be a GUC to control this behavior if the 
performance impact is noticeable.


--
Fabien.




Re: Regarding the necessity of RelationGetNumberOfBlocks for every rescan / bitmap heap scan.

2021-05-29 Thread Andy Fan
> 1. Why do we need scan->rs_nblocks =
>RelationGetNumberOfBlocks(scan->rs_base.rs_rd) for every rescan? It looks
>mismatched with the comments along the code, and the comments look
>reasonable to me.
> 2. For the heap scan after an IndexScan, we don't need to know the heap
>size, so why do we need to get the nblocks for a bitmap heap scan? I think
>the similarity between the two is that both of them can get a "valid"
>CTID/page number from the index scan.  To be clearer, I think for bitmap
>heap scan we don't even need to check RelationGetNumberOfBlocks in the
>initscan.
> 3. If we need to check nblocks every time, why doesn't Parallel Scan change
>it every time?
>
> Shall we remove the RelationGetNumberOfBlocks call for bitmap heap scan
> entirely, and the one in rescan for a normal heap scan?
>
>
yizhi.fzh@e18c07352 /u/y/g/postgres> git diff
diff --git a/src/backend/access/heap/heapam.c
b/src/backend/access/heap/heapam.c
index 6ac07f2fda..6df096fb46 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -246,7 +246,7 @@ initscan(HeapScanDesc scan, ScanKey key, bool
keep_startblock)
bpscan = (ParallelBlockTableScanDesc)
scan->rs_base.rs_parallel;
scan->rs_nblocks = bpscan->phs_nblocks;
}
-   else
+   else if (scan->rs_nblocks == -1 &&
+!(scan->rs_base.rs_flags & SO_TYPE_BITMAPSCAN))
scan->rs_nblocks =
RelationGetNumberOfBlocks(scan->rs_base.rs_rd);

/*
@@ -1209,6 +1209,7 @@ heap_beginscan(Relation relation, Snapshot snapshot,
scan->rs_base.rs_flags = flags;
scan->rs_base.rs_parallel = parallel_scan;
scan->rs_strategy = NULL;   /* set in initscan */
+   scan->rs_nblocks = -1;


I did the above hacks, and all the existing tests passed.

>
-- 
Best Regards
Andy Fan (https://www.aliyun.com/)


O_DIRECT on macOS

2021-05-29 Thread Thomas Munro
Hi,

IRIX gave the world O_DIRECT, and then every Unix I've used followed
their lead except Apple's, which gave the world fcntl(fd, F_NOCACHE,
1).  From what I could find in public discussion, this API difference
may stem from the caching policy being controlled at the per-file
(vnode) level in older macOS (and perhaps ancestors), but since 10.4
it's per file descriptor, so approximately like O_DIRECT on other
systems.  The precise effects and constraints of O_DIRECT/F_NOCACHE
are different across operating systems and file systems in some subtle
and not-so-subtle ways, but the general concept is the same: try to
avoid buffering.

I thought about a few different ways to encapsulate this API
difference in PostgreSQL, and toyed with two:

1.  We could define our own fake O_DIRECT flag, and translate that to
the right thing inside BasicOpenFilePerm().  That seems a bit icky.
We'd have to be careful not to collide with system-defined flags and
worry about future changes.  We do that sort of thing for Windows, though
that's a bit different: there we translate *all* the flags from
POSIXesque to Windowsian.

2.  We could make an extended BasicOpenFilePerm() variant that takes a
separate boolean parameter for direct, so that we don't have to hijack
any flag space, but now we need new interfaces just to tolerate a
rather niche system.

Here's a draft patch like #2, just for discussion.  Better ideas?
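
To illustrate the shape of option #2, here's a minimal sketch of the idea
(not the attached patch; open_maybe_direct is a made-up name):

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch of option #2: an open() wrapper that takes a separate "direct"
 * parameter instead of smuggling a fake O_DIRECT flag through "flags".
 * Illustrative only.
 */
static int
open_maybe_direct(const char *path, int flags, mode_t mode, bool direct)
{
	int			fd;

#if defined(O_DIRECT)
	if (direct)
		flags |= O_DIRECT;
#endif
	fd = open(path, flags, mode);
#if !defined(O_DIRECT) && defined(F_NOCACHE)
	/* macOS: no O_DIRECT; request uncached I/O per file descriptor */
	if (fd >= 0 && direct && fcntl(fd, F_NOCACHE, 1) < 0)
	{
		int			save_errno = errno;

		close(fd);
		errno = save_errno;
		return -1;
	}
#endif
	return fd;
}
```

The point is that callers request direct I/O explicitly, so no flag space
has to be hijacked.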

The reason I want to get direct I/O working on this "client" OS is
because the AIO project will propose to use direct I/O for the buffer
pool as an option, and I would like Macs to be able to do that
primarily for the sake of developers trying out the patch set.  Based
on memories from the good old days of attending conferences, a decent
percentage of PostgreSQL developers are on Macs.

As it stands, the patch only actually has any effect if you set
wal_level=minimal and max_wal_senders=0, which is a configuration that
I guess almost no-one uses.  Otherwise xlog.c assumes that the
filesystem is going to be used for data exchange with replication
processes (something we should replace with WAL buffers in shmem some
time soon) so for now it's better to keep the data in page cache since
it'll be accessed again soon.

Unfortunately, this change makes pg_test_fsync show a very slightly
lower number for open_datasync on my ancient Intel Mac, but
pg_test_fsync isn't really representative anymore since minimal
logging is by now unusual (I guess pg_test_fsync would ideally do the
test with and without direct to make that clearer).  Whether this is a
good option for the WAL is separate from whether it's a good option
for relation data (ie a way to avoid large scale double buffering, but
have new, different problems), and later patches will propose new
separate GUCs to control that.


0001-Support-direct-I-O-on-macOS.patch
Description: Binary data


Clear empty space in a page.

2021-05-29 Thread Yura Sokolov

Good day.

A long time ago I played with a proprietary "compressed storage" patch
on a heavily updated table, and found that empty pages (i.e. cleaned by
vacuum) are not compressed well.

When a table is stress-updated, pages for new row versions are allocated
in a round-robin fashion, so some 1GB segments contain almost no live
tuples. Vacuum removes the dead tuples, but the segments remain large
after compression (>400MB), as if they were still full.

After some investigation I found this is because PageRepairFragmentation
and PageIndex*Delete* don't clear space that just became empty, so it
still contains garbage data. Clearing it with memset greatly increases
the compression ratio: some compressed relation segments become 30-60MB
right after vacuum removes the tuples in them.
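
The effect is easy to reproduce outside PostgreSQL; here's a quick
sketch using Python's zlib as a stand-in for filesystem or WAL
compression, with the page layout simplified to a header plus hole:

```python
import os
import zlib

PAGE = 8192          # PostgreSQL's default block size
HEADER = 256         # simplified stand-in for page header + line pointers

# A "hole" that still contains bytes from old, deleted tuples compresses
# poorly; random bytes simulate that garbage.
dirty_page = bytes(HEADER) + os.urandom(PAGE - HEADER)

# The same page with its hole memset to zero compresses very well.
clean_page = bytes(HEADER) + bytes(PAGE - HEADER)

print(len(zlib.compress(dirty_page)))  # close to the full page size
print(len(zlib.compress(clean_page)))  # a few dozen bytes
```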

While this result does not apply directly to stock PostgreSQL, I believe
page compression is important for full_page_writes with wal_compression
enabled, and probably when PostgreSQL is used on a filesystem with
compression enabled (ZFS?).

Therefore I propose clearing page's empty space with zero in
PageRepairFragmentation, PageIndexMultiDelete, PageIndexTupleDelete and
PageIndexTupleDeleteNoCompact.

Sorry, didn't measure impact on raw performance yet.

regards,
Yura Sokolov aka funny_falcon

commit 6abfcaeb87fcb396c5e2dccd434ce2511314ff76
Author: Yura Sokolov 
Date:   Sun May 30 02:39:17 2021 +0300

Clear empty space in a page

Write zeroes to just cleared space in PageRepairFragmentation,
PageIndexTupleDelete, PageIndexMultiDelete and PageIndexDeleteNoCompact.

It helps increase the compression ratio on compression-enabled filesystems
and with full_page_writes and wal_compression enabled.

diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 82ca91f5977..7deb6cc71a4 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -681,6 +681,17 @@ compactify_tuples(itemIdCompact itemidbase, int nitems, Page page, bool presorte
 	phdr->pd_upper = upper;
 }
 
+/*
+ * Clean up space between pd_lower and pd_upper for better page compression.
+ */
+static void
+memset_hole(Page page, LocationIndex old_pd_upper)
+{
+	PageHeader	phdr = (PageHeader) page;
+	if (phdr->pd_upper > old_pd_upper)
+		MemSet((char *)page + old_pd_upper, 0, phdr->pd_upper - old_pd_upper);
+}
+
 /*
  * PageRepairFragmentation
  *
@@ -797,6 +808,7 @@ PageRepairFragmentation(Page page)
 
 		compactify_tuples(itemidbase, nstorage, page, presorted);
 	}
+	memset_hole(page, pd_upper);
 
 	/* Set hint bit for PageAddItemExtended */
 	if (nunused > 0)
@@ -1114,6 +1126,7 @@ PageIndexTupleDelete(Page page, OffsetNumber offnum)
 
 	if (offset > phdr->pd_upper)
 		memmove(addr + size, addr, offset - phdr->pd_upper);
+	MemSet(addr, 0, size);
 
 	/* adjust free space boundary pointers */
 	phdr->pd_upper += size;
@@ -1271,6 +1284,7 @@ PageIndexMultiDelete(Page page, OffsetNumber *itemnos, int nitems)
 		compactify_tuples(itemidbase, nused, page, presorted);
 	else
 		phdr->pd_upper = pd_special;
+	memset_hole(page, pd_upper);
 }
 
 
@@ -1351,6 +1365,7 @@ PageIndexTupleDeleteNoCompact(Page page, OffsetNumber offnum)
 
 	if (offset > phdr->pd_upper)
 		memmove(addr + size, addr, offset - phdr->pd_upper);
+	MemSet(addr, 0, size);
 
 	/* adjust free space boundary pointer */
 	phdr->pd_upper += size;


Re: security_definer_search_path GUC

2021-05-29 Thread Marko Tiikkaja
On Sat, May 29, 2021 at 11:06 PM Joel Jacobson  wrote:

> Glad you bring this problem up for discussion, something should be done to
> improve the situation.
>
> Personally, I really dislike search_path and consider using it an
> anti-pattern.
> I would rather prefer a GUC to hard-code search_path to a constant default
> value of just ‘public’ that cannot be changed by anyone or any function.
> Trying to change it to a different value would raise an exception.
>

That would work, too!  I think it's a nice idea, perhaps even better than
what I proposed.  I would be happy to see either one incorporated.


.m


Re: security_definer_search_path GUC

2021-05-29 Thread Joel Jacobson
Glad you bring this problem up for discussion, something should be done to 
improve the situation.

Personally, I really dislike search_path and consider using it an 
anti-pattern.
I would rather prefer a GUC to hard-code search_path to a constant default 
value of just ‘public’ that cannot be changed by anyone or any function. Trying 
to change it to a different value would raise an exception.

This would work for me since I always fully-qualify all objects except the ones 
in public.
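
For reference, the "annoying incantation" a safe SECURITY DEFINER
function needs today looks something like this (table and function names
are illustrative):

```sql
-- Without the SET clause, a caller who controls search_path can make
-- unqualified names resolve to objects they created.
CREATE FUNCTION transfer(from_acct int, to_acct int, amount numeric)
RETURNS void
LANGUAGE sql
SECURITY DEFINER
SET search_path = pg_catalog, pg_temp   -- the incantation
AS $$
  UPDATE public.accounts SET balance = balance - amount WHERE id = from_acct;
  UPDATE public.accounts SET balance = balance + amount WHERE id = to_acct;
$$;
```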

/Joel

On Thu, May 27, 2021, at 13:23, Marko Tiikkaja wrote:
> Hi,
> 
> Since writing SECURITY DEFINER functions securely requires annoying 
> incantations[1], wouldn't it be nice if we provided a way for the superuser 
> to override the default search path via a GUC in postgresql.conf?  That way 
> you can set search_path if you want to override the default, but if you leave 
> it out you're not vulnerable, assuming security_definer_search_path only 
> contains secure schemas.
> 
> 
> .m

Kind regards,

Joel


Re: fdatasync performance problem with large number of DB files

2021-05-29 Thread Justin Pryzby
On Tue, May 25, 2021 at 07:13:59PM -0500, Justin Pryzby wrote:
> On Sat, Mar 20, 2021 at 12:16:27PM +1300, Thomas Munro wrote:
> > > > + {
> > > > + {"recovery_init_sync_method", PGC_POSTMASTER, 
> > > > ERROR_HANDLING_OPTIONS,
> > > > + gettext_noop("Sets the method for synchronizing 
> > > > the data directory before crash recovery."),
> > > > + },
> 
> Is there any reason why this can't be PGC_SIGHUP ?
> (Same as restart_after_crash, remove_temp_files_after_crash)

I can't see any reason why this is nontrivial.
What about data_sync_retry?
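
With PGC_SIGHUP, the usual reload dance would then work for this
setting, e.g.:

```sql
-- Sketch: after the proposed change, the setting can be updated and
-- picked up on reload, with no server restart required.
ALTER SYSTEM SET recovery_init_sync_method = 'syncfs';
SELECT pg_reload_conf();
```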

commit 2d2d2e10f99548c486b50a1ce095437d558e8649
Author: Justin Pryzby 
Date:   Sat May 29 13:41:14 2021 -0500

Change recovery_init_sync_method to PGC_SIGHUP..

The setting has no effect except during startup.
But it's nice to be able to change the setting dynamically, which is
expected to be pretty useful to an admin following crash recovery when
turning the service off and on again is not so appealing.

See also: 2941138e6, 61752afb2

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d8c0fd3315..ab9916eac5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -9950,7 +9950,8 @@ dynamic_library_path = 
'C:\tools\postgresql;H:\my_project\lib;$libdir'
 appear only in kernel logs.


-This parameter can only be set at server start.
+This parameter can only be set in the postgresql.conf
+file or on the server command line.

   
  
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 87bc688704..796b4e83ce 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -4945,7 +4945,7 @@ static struct config_enum ConfigureNamesEnum[] =
},
 
{
-   {"recovery_init_sync_method", PGC_POSTMASTER, ERROR_HANDLING_OPTIONS,
+   {"recovery_init_sync_method", PGC_SIGHUP, ERROR_HANDLING_OPTIONS,
gettext_noop("Sets the method for synchronizing the data directory before crash recovery."),
},
&recovery_init_sync_method,
diff --git a/src/backend/utils/misc/postgresql.conf.sample 
b/src/backend/utils/misc/postgresql.conf.sample
index ddbb6dc2be..9c4c4a9eec 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -774,7 +774,6 @@
 					# data?
 					# (change requires restart)
 #recovery_init_sync_method = fsync	# fsync, syncfs (Linux 5.8+)
-					# (change requires restart)
 
 
 #--




Re: CALL versus procedures with output-only arguments

2021-05-29 Thread Tom Lane
Here's a stripped-down patch that drops the change in what should be
in CALL argument lists, and just focuses on reverting the change in
pg_proc.proargtypes and the consequent mess for ALTER/DROP ROUTINE.
I spent some more effort on the docs, too.

regards, tom lane

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 16493209c6..f517a7d4af 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5905,9 +5905,8 @@ SCRAM-SHA-256$iteration count:
   
An array of the data types of the function arguments.  This includes
only input arguments (including INOUT and
-   VARIADIC arguments), as well as
-   OUT parameters of procedures, and thus represents
-   the call signature of the function or procedure.
+   VARIADIC arguments), and thus represents
+   the call signature of the function.
   
  
 
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 52f60c827c..0cd6e8b071 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -480,7 +480,7 @@ $$ LANGUAGE plpgsql;
 
  
   To call a function with OUT parameters, omit the
-  output parameter in the function call:
+  output parameter(s) in the function call:
 
 SELECT sales_tax(100.00);
 
@@ -523,16 +523,20 @@ $$ LANGUAGE plpgsql;
 
 
   In a call to a procedure, all the parameters must be specified.  For
-  output parameters, NULL may be specified.
+  output parameters, NULL may be specified when
+  calling the procedure from plain SQL:
 
 CALL sum_n_product(2, 4, NULL, NULL);
  sum | prod
 -+--
6 |8
 
-  Output parameters in procedures become more interesting in nested calls,
-  where they can be assigned to variables.  See  for details.
+
+  However, when calling a procedure
+  from PL/pgSQL, you should instead write a
+  variable for any output parameter; the variable will receive the result
+  of the call.  See 
+  for details.
  
 
  
@@ -2030,6 +2034,9 @@ BEGIN
 END;
 $$;
 
+ The variable corresponding to an output parameter can be a simple
+ variable or a field of a composite-type variable.  Currently,
+ it cannot be an element of an array.
 

 
diff --git a/doc/src/sgml/ref/alter_extension.sgml b/doc/src/sgml/ref/alter_extension.sgml
index 38fd60128b..c819c7bb4e 100644
--- a/doc/src/sgml/ref/alter_extension.sgml
+++ b/doc/src/sgml/ref/alter_extension.sgml
@@ -212,12 +212,11 @@ ALTER EXTENSION name DROP IN, OUT,
INOUT, or VARIADIC.
If omitted, the default is IN.
-   Note that ALTER EXTENSION does not actually pay any
-   attention to OUT arguments for functions and
-   aggregates (but not procedures), since only the input arguments are
-   needed to determine the function's identity.  So it is sufficient to
-   list the IN, INOUT, and
-   VARIADIC arguments for functions and aggregates.
+   Note that ALTER EXTENSION does not actually pay
+   any attention to OUT arguments, since only the input
+   arguments are needed to determine the function's identity.
+   So it is sufficient to list the IN, INOUT,
+   and VARIADIC arguments.
   
  
 
diff --git a/doc/src/sgml/ref/call.sgml b/doc/src/sgml/ref/call.sgml
index abaa81c78b..dbad89bbf7 100644
--- a/doc/src/sgml/ref/call.sgml
+++ b/doc/src/sgml/ref/call.sgml
@@ -55,9 +55,21 @@ CALL name ( [ argument
 
  
-  An input argument for the procedure call.
-  See  for the full details on
-  function and procedure call syntax, including use of named parameters.
+  An argument expression for the procedure call.
+ 
+
+ 
+  Arguments can include parameter names, using the syntax
+  name = value.
+  This works the same as in ordinary function calls; see
+   for details.
+ 
+
+ 
+  Arguments must be supplied for all procedure parameters that lack
+  defaults, including OUT parameters.  However,
+  arguments matching OUT parameters are not evaluated,
+  so it's customary to just write NULL for them.
  
 

diff --git a/doc/src/sgml/ref/comment.sgml b/doc/src/sgml/ref/comment.sgml
index eda91b4e24..6e8ced3eaf 100644
--- a/doc/src/sgml/ref/comment.sgml
+++ b/doc/src/sgml/ref/comment.sgml
@@ -178,12 +178,11 @@ COMMENT ON
   argument: IN, OUT,
   INOUT, or VARIADIC.
   If omitted, the default is IN.
-  Note that COMMENT does not actually pay any attention
-  to OUT arguments for functions and aggregates (but
-  not procedures), since only the input arguments are needed to determine
-  the function's identity.  So it is sufficient to list the
-  IN, INOUT, and
-  VARIADIC arguments for functions and aggregates.
+  Note that COMMENT does not actually pay
+  any attention to OUT arguments, since only the input
+  arguments are needed to determine the function's identity.
+  So 

Re: CREATE COLLATION - check for duplicate options and error out if found one

2021-05-29 Thread Bharath Rupireddy
On Sat, May 29, 2021 at 9:08 PM vignesh C  wrote:
> One minor comment:
> You can remove the brackets around errcode, You could change:
> + if (localeEl)
> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("option \"%s\" specified more than once", defel->defname),
> + parser_errposition(pstate, defel->location)));
> to:
> + if (localeEl)
> + ereport(ERROR,
> + errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("option \"%s\" specified more than once", defel->defname),
> + parser_errposition(pstate, defel->location));

Thanks. PSA v3 patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


v3-0001-Check-duplicate-options-and-error-out-for-CREATE-.patch
Description: Binary data


Re: Addition of alias types regpublication and regsubscription

2021-05-29 Thread Tom Lane
vignesh C  writes:
> I felt inclusion of alias types regpublication and regsubscription will
> help the logical replication users.

This doesn't really seem worth the trouble --- how often would you
use these?

If we had a policy of inventing reg* aliases for every kind of catalog
object, that'd be one thing, but we don't.  (And the overhead in
inventing new object kinds is already high enough, so I'm not in favor
of creating such a policy.)

regards, tom lane




Re: CREATE COLLATION - check for duplicate options and error out if found one

2021-05-29 Thread vignesh C
On Wed, May 26, 2021 at 7:44 PM Bharath Rupireddy
 wrote:
>
> On Wed, May 26, 2021 at 7:18 PM vignesh C  wrote:
> > +1 for fixing this issue, we have handled this error in other places.
> > The patch does not apply on head, could you rebase the patch on head
> > and post a new patch.
>
> Thanks. I thought of rebasing once the other patch (which reorganizes
> "...specified more than once" error) gets committed. Anyways, I've
> rebased for now on the latest master. Please review v2 patch.
>

Thanks for the updated patch.
One minor comment:
You can remove the brackets around errcode, You could change:
+ if (localeEl)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option \"%s\" specified more than once", defel->defname),
+ parser_errposition(pstate, defel->location)));
to:
+ if (localeEl)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option \"%s\" specified more than once", defel->defname),
+ parser_errposition(pstate, defel->location));

Regards,
Vignesh




Addition of alias types regpublication and regsubscription

2021-05-29 Thread vignesh C
Hi,

I feel the inclusion of alias types regpublication and regsubscription
would help logical replication users. This will also help in [1].
The alias types allow simplified lookup of publication oid values for
objects. For example, to examine the pg_publication_rel rows, one could
write:
SELECT prpubid::regpublication, prrelid::regclass FROM pg_publication_rel;

rather than:
SELECT p.pubname, prrelid::regclass FROM pg_publication_rel pr,
pg_publication p WHERE pr.prpubid = p.oid;

Similarly in case of subscription:
For example, to examine the pg_subscription_rel rows, one could write:
SELECT srsubid::regsubscription, srrelid::regclass FROM pg_subscription_rel;

rather than:
SELECT s.subname, srrelid::regclass FROM pg_subscription_rel sr,
pg_subscription s WHERE sr.srsubid = s.oid;
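
Based on the doc additions in the attached patch, the proposed cast and
lookup function would behave roughly like this (testpub is an assumed
publication name):

```sql
-- Sketch of the proposed behavior:
SELECT 'testpub'::regpublication;        -- publication's OID; errors if it doesn't exist
SELECT to_regpublication('testpub');     -- same lookup, but returns NULL when not found
SELECT to_regpublication('no_such_pub'); -- NULL, no error
```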

Attached patch has the changes for the same.
Thoughts?

[1] -
https://www.postgresql.org/message-id/flat/CALDaNm0OANxuJ6RXqwZsM1MSY4s19nuH3734j4a72etDwvBETQ%40mail.gmail.com

Regards,
Vignesh
From 3ef5874b4e3b65b81199dbf878f53d820940be96 Mon Sep 17 00:00:00 2001
From: vignesh 
Date: Wed, 19 May 2021 10:53:29 +0530
Subject: [PATCH v1 1/2] Implement type regpublication

It will help in casting publication information and it's also just generally
useful, like the other reg* types.
---
 doc/src/sgml/datatype.sgml|  11 +++
 doc/src/sgml/func.sgml|  18 
 doc/src/sgml/ref/pgupgrade.sgml   |   1 +
 src/backend/utils/adt/regproc.c   | 126 ++
 src/bin/pg_upgrade/check.c|   3 +-
 src/include/catalog/pg_cast.dat   |  14 +++
 src/include/catalog/pg_proc.dat   |  15 +++
 src/include/catalog/pg_type.dat   |   4 +
 src/test/regress/expected/regproc.out |  55 +++
 src/test/regress/sql/regproc.sql  |  16 
 10 files changed, 262 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 0e8ef958a9..dd578f3535 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4603,6 +4603,10 @@ INSERT INTO mytable VALUES(-1);  -- fails
 regprocedure

 
+   
+regpublication
+   
+

 regrole

@@ -4763,6 +4767,13 @@ SELECT * FROM pg_attribute
 sum(int4)

 
+   
+regpublication
+pg_publication
+publication name
+testpub
+   
+

 regrole
 pg_authid
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 08b07f561e..50da778b06 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -23596,6 +23596,24 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ to_regpublication
+
+to_regpublication ( text )
+regpublication
+   
+   
+Translates a textual publication name to its OID.  A similar result is
+obtained by casting the string to type regpublication (see
+); however, this function will return
+NULL rather than throwing an error if the name is
+not found.  Also unlike the cast, this does not accept
+a numeric OID as input.
+   
+  
+
   

 
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index a83c63cd98..bb2d908da6 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -785,6 +785,7 @@ psql --username=postgres --file=script.sql postgres
 regoperator
 regproc
 regprocedure
+regpublication

(regclass, regrole, and regtype can be upgraded.)
   
diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index f998fe2076..dfa826f1c2 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@ -27,6 +27,7 @@
 #include "catalog/pg_collation.h"
 #include "catalog/pg_operator.h"
 #include "catalog/pg_proc.h"
+#include "catalog/pg_publication.h"
 #include "catalog/pg_ts_config.h"
 #include "catalog/pg_ts_dict.h"
 #include "catalog/pg_type.h"
@@ -1231,6 +1232,131 @@ regcollationsend(PG_FUNCTION_ARGS)
 	return oidsend(fcinfo);
 }
 
+/*
+ * regpublicationin		- converts "publicationname" to publication OID
+ *
+ * We also accept a numeric OID, for symmetry with the output routine.
+ *
+ * '-' signifies unknown (OID 0).  In all other cases, the input must
+ * match an existing pg_publication entry.
+ */
+Datum
+regpublicationin(PG_FUNCTION_ARGS)
+{
+	char	   *publication_name_or_oid = PG_GETARG_CSTRING(0);
+	Oid			result = InvalidOid;
+	List	   *names;
+
+	/* '-' ? */
+	if (strcmp(publication_name_or_oid, "-") == 0)
+		PG_RETURN_OID(InvalidOid);
+
+	/* Numeric OID? */
+	if (publication_name_or_oid[0] >= '0' &&
+		publication_name_or_oid[0] <= '9' &&
+		strspn(publication_name_or_oid, "0123456789") == strlen(publication_name_or_oid))
+	{
+		result = DatumGetObjectId(DirectFunctionCall1(oidin,
+	  CStringGetDatum(publication_name_or_oid)));
+		PG_RETURN_OID(result);
+	}
+
+	/* The rest of 

Re: be-secure-gssapi.c and auth.c with setenv() not compatible on Windows

2021-05-29 Thread Tom Lane
Michael Paquier  writes:
> On Fri, May 28, 2021 at 11:37:22AM -0400, Tom Lane wrote:
>> It's not clear to me how much of 7ca37fb you're envisioning
>> back-patching in (2).  I think it'd be best to back-patch
>> only the addition of pgwin32_setenv, and then let the gssapi
>> code use it.  In that way, if there's anything wrong with
>> pgwin32_setenv, we're only breaking code that never worked
>> on Windows before anyway.

> Just to be clear, for 2) I was thinking to pick up the minimal parts
> you have changed in win32env.c and add src/port/setenv.c to add the
> fallback implementation of setenv(), without changing anything else.
> This also requires grabbing the small changes within pgwin32_putenv(),
> visibly.

What I had in mind was to *only* add pgwin32_setenv, not setenv.c.
There's no evidence that any other modern platform lacks setenv.
Moreover, there's no issue in these branches unless your platform
lacks setenv yet has GSS support.
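
For reference, a putenv()-based setenv() fallback of the sort being
discussed can be sketched like this (setenv_sketch is a made-up name,
not the actual pgwin32_setenv):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of a setenv() fallback built on putenv().  The allocated
 * string is intentionally "leaked": putenv() keeps a pointer to it.
 */
static int
setenv_sketch(const char *name, const char *value, int overwrite)
{
	char	   *envstr;

	if (!overwrite && getenv(name) != NULL)
		return 0;

	envstr = malloc(strlen(name) + strlen(value) + 2);
	if (envstr == NULL)
		return -1;
	sprintf(envstr, "%s=%s", name, value);
	return putenv(envstr);
}
```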

regards, tom lane




Re: Decoding speculative insert with toast leaks memory

2021-05-29 Thread Tomas Vondra
On 5/29/21 6:29 AM, Amit Kapila wrote:
> On Fri, May 28, 2021 at 5:16 PM Tomas Vondra
>  wrote:
>>
>> I wonder if there's a way to free the TOASTed data earlier, instead of
>> waiting until the end of the transaction (as this patch does).
>>
> 
> IIUC we are anyway freeing the toasted data at the next
> insert/update/delete. We can try to free at other change message types
> like REORDER_BUFFER_CHANGE_MESSAGE but as you said that may make the
> patch more complex, so it seems better to do the fix on the lines of
> what is proposed in the patch.
> 

+1

Even if we started doing what you mention (freeing the hash for other
change types), we'd still need to do what the patch proposes because the
speculative insert may be the last change in the transaction. For the
other cases it works as a mitigation, so that we don't leak the memory
forever.

So let's get this committed, perhaps with a comment explaining that it
might be possible to reset earlier if needed.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: be-secure-gssapi.c and auth.c with setenv() not compatible on Windows

2021-05-29 Thread Michael Paquier
On Fri, May 28, 2021 at 11:37:22AM -0400, Tom Lane wrote:
> There's a lot of value in keeping the branches looking alike.
> On the other hand, 7ca37fb hasn't survived contact with the
> public yet, so I'm a bit nervous about it.

I don't think this set of complications is worth the risk of
destabilizing those stable branches.

> It's not clear to me how much of 7ca37fb you're envisioning
> back-patching in (2).  I think it'd be best to back-patch
> only the addition of pgwin32_setenv, and then let the gssapi
> code use it.  In that way, if there's anything wrong with
> pgwin32_setenv, we're only breaking code that never worked
> on Windows before anyway.

Just to be clear, for 2) I was thinking to pick up the minimal parts
you have changed in win32env.c and add src/port/setenv.c to add the
fallback implementation of setenv(), without changing anything else.
This also requires grabbing the small changes within pgwin32_putenv(),
visibly.
--
Michael


signature.asc
Description: PGP signature


Re: Skipping logical replication transactions on subscriber side

2021-05-29 Thread Amit Kapila
On Sat, May 29, 2021 at 8:27 AM Masahiko Sawada  wrote:
>
> On Thu, May 27, 2021 at 7:04 PM Amit Kapila  wrote:
> >
> > On Thu, May 27, 2021 at 1:46 PM Masahiko Sawada  
> > wrote:
> > >
> > > 1. the worker records the XID and commit LSN of the failed transaction
> > > to a catalog.
> > >
> >
> > When will you record this info? I am not sure if we can try to update
> > this when an error has occurred. We can think of using try..catch in
> > apply worker and then record it in catch on error but would that be
> > advisable? One random thought that occurred to me is to that apply
> > worker notifies such information to the launcher (or maybe another
> > process) which will log this information.
>
> Yeah, I was concerned about that too and had the same idea. The
> information still could not be written if the server crashes before
> the launcher writes it. But I think that's acceptable.
>

True, because even if the launcher restarts, the apply worker will
error out again and resend the information. I guess we can have an
error queue where apply workers can add their information and the
launcher will then process those.  If we do that, then we need to
define what we want to do if the queue gets full: either the apply
worker nudges the launcher and waits, or it just throws an error and
continues. If you have any better ideas for sharing this information,
we can consider those as well.

> >
> > > 2. the user specifies how to resolve that conflict transaction
> > > (currently only 'skip' is supported) and writes to the catalog.
> > > 3. the worker does the resolution method according to the catalog. If
> > > the worker didn't start to apply those changes, it can skip the entire
> > > transaction. If did, it rollbacks the transaction and ignores the
> > > remaining.
> > >
> > > The worker needs neither to reset information of the last failed
> > > transaction nor to mark the conflicted transaction as resolved. The
> > > worker will ignore that information when checking the catalog if the
> > > commit LSN is passed.
> > >
> >
> > So won't this require us to check the required info in the catalog
> > before applying each transaction? If so, that might be overhead, maybe
> > we can build some cache of the highest commitLSN that can be consulted
> > rather than the catalog table.
>
> I think workers can cache that information at start, and invalidate
> and reload the cache when the catalog gets updated.  Specifying an
> XID to skip will update the catalog, invalidating the cache.
>
> > I think we need to think about when to
> > remove rows for which conflict has been resolved as we can't let that
> > information grow infinitely.
>
> I guess we can update catalog tuples in place when another conflict
> happens next time. The catalog tuple should be fixed size. The
> already-resolved conflict will have the commit LSN older than its
> replication origin's LSN.
>

Okay, but I have a slight concern that we will keep XIDs in the system
which might no longer be valid. So, we will keep this info about
subscribers around until one performs DROP SUBSCRIPTION; hopefully,
that doesn't lead to too many rows. This will be okay as per the
current design, but say tomorrow we decide to parallelize the apply
for a subscription; then there could be multiple errors corresponding
to a subscription, and in that case such a design might appear quite
limiting. One possibility could be that when the launcher is
periodically checking for new error messages, it can clean up the
conflicts catalog as well, or maybe autovacuum does this periodically
as it does for stats (via pgstat_vacuum_stat).

-- 
With Regards,
Amit Kapila.