Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2020-12-11 Thread Bharath Rupireddy
On Sat, Dec 12, 2020 at 12:19 AM Fujii Masao 
wrote:
> I was thinking that in the case of drop of user mapping or server,
> hash_search(ConnectionHash) in GetConnection() cannot find the cached
> connection entry invalidated by that drop. Because "user->umid" used as
> hash key is changed. So I was thinking that that invalidated connection
> will not be closed nor reconnected.
>

You are right in saying that the connection leaks.

Use case 1:
1) Run foreign query in session1 with server1, user mapping1
2) Drop user mapping1 in another session2; an invalidation message gets logged
which will have to be processed by other sessions
3) Run a foreign query again in session1. At the start of the txn, the cached
entry gets invalidated via pgfdw_inval_callback(). Whatever the type of
foreign query (select, update, explain, delete, insert, analyze etc.), on the
next call to GetUserMapping() from postgres_fdw.c the cache lookup fails
(with ERROR:  user mapping not found for "") since user mapping1 has been
dropped in session2, so the query also fails before reaching GetConnection(),
where the connections associated with invalidated entries would have been
disconnected.

So, the connection associated with invalidated entry would remain until the
local session exits which is a problem to solve.

Use case 2:
1) Run foreign query in session1 with server1, user mapping1
2) Try to drop foreign server1, then we would not be allowed to do so
because of dependency. If we use CASCADE, then the dependent user mapping1
and foreign tables get dropped too [1].
3) Run a foreign query again in session1. At the start of the txn, the cached
entry gets invalidated via pgfdw_inval_callback(), and the query fails because
the foreign table and user mapping1 no longer exist.

But, note that the connection remains open in session1, which is again a
problem to solve.

To solve the above connection leak problem, it looks like the right place
to close all the invalid connections is pgfdw_xact_callback(), which, once
registered, gets called at the end of every txn in the current session (by
then all the sub txns would also have finished). Note that if there are many
invalidated entries, one of the subsequent txns has to bear the cost of
running this extra code, but that's better than having leaked connections.
Thoughts? If okay, I can code a separate patch.

static void
pgfdw_xact_callback(XactEvent event, void *arg)
{
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
/* HERE WE CAN LOOK FOR ALL INVALIDATED ENTRIES AND DISCONNECT THEM */
/* Quick exit if no connections were touched in this transaction. */
if (!xact_got_connection)
return;
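
For illustration, the marked spot could expand to something like the
following, reusing only the fields and helpers already quoted in this thread
(entry->conn, entry->invalidated, entry->xact_depth, disconnect_pg_server());
this is just a sketch, not a finished patch:

    hash_seq_init(&scan, ConnectionHash);
    while ((entry = (ConnCacheEntry *) hash_seq_search(&scan)) != NULL)
    {
        /* skip entries that have no open connection or are still valid */
        if (entry->conn == NULL || !entry->invalidated)
            continue;

        /* only close once no (sub)transaction is using the connection */
        if (entry->xact_depth == 0)
        {
            elog(DEBUG3, "closing invalidated connection %p", entry->conn);
            disconnect_pg_server(entry);
        }
    }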

And we can also extend postgres_fdw_disconnect() along these lines:

postgres_fdw_disconnect(bool invalid_only) --> invalid_only defaults to
false, in which case all connections are disconnected; when invalid_only is
set to true, only invalid connections are disconnected.
postgres_fdw_disconnect('server_name') --> disconnects the connections
associated with the specified foreign server

Having said this, I'm not in favour of the invalid_only flag, because if we
choose to change the code in pgfdw_xact_callback to solve the connection
leak problem, we may not need this invalid_only flag at all: at the end of
a txn (pgfdw_xact_callback gets called even for txns in which the queries
fail with an error), all the existing invalid connections get disconnected.
Thoughts?

[1]
postgres=# drop server loopback1 ;
ERROR:  cannot drop server loopback1 because other objects depend on it
DETAIL:  user mapping for bharath on server loopback1 depends on server
loopback1
foreign table f1 depends on server loopback1
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

postgres=# drop server loopback1 CASCADE ;
NOTICE:  drop cascades to 2 other objects
DETAIL:  drop cascades to user mapping for bharath on server loopback1
drop cascades to foreign table f1
DROP SERVER

> >  if (entry->conn != NULL && entry->invalidated && entry->xact_depth == 0)
> >  {
> >  elog(DEBUG3, "closing connection %p for option changes to take effect",
> >   entry->conn);
> >  disconnect_pg_server(entry);
> >  }
> >
> >> If so, this seems like a connection-leak bug, at least for me. Thoughts?
> >>
> >
> > It's not a leak. The comment before pgfdw_inval_callback() [1]
> > explains why we can not immediately close/disconnect the connections
> > in pgfdw_inval_callback() after marking them as invalidated.
>
> *If* an invalidated connection cannot be closed immediately even in the case
> of drop of server or user mapping, we can defer it to the subsequent call
> to GetConnection(). That is, GetConnection() closes not only the target
> invalidated connection but also all the other invalidated connections. Of
> course, invalidated connections will remain until the subsequent
> GetConnection() is called, though.
>

I think my detailed response to the above comment clarifies this.

> > Here is the scenario how in the midst of a txn we get invalidation
> 

Re: pg_basebackup test coverage

2020-12-11 Thread Noah Misch
On Fri, Dec 11, 2020 at 12:23:10PM -0500, Robert Haas wrote:
> On Fri, Dec 11, 2020 at 3:04 AM Michael Paquier  wrote:
> > Why don't you just use Archive::Tar [1] instead of looking for some system
> > commands?  Combining list_files() with extract(), it is possible to
> > extract an archive, with or without compression, without hoping for an
> > equivalent to exist on Windows.  That would not be the first time
> > either that there is a TAP test that skips some tests if a module does
> > not exist.  See for example what psql does with IO::Pty.
> 
> Well, either this or Noah's method has the disadvantage that not
> everyone will get the benefit of the tests, and that those who wish to
> get that benefit must install more stuff. But, all three of the
> computers I have within arm's reach (yeah, I might have a problem)
> have Archive::Tar installed, so maybe it's not a big concern in
> practice. I am slightly inclined to think that the perl package
> approach might be better than shell commands because perhaps it is
> more likely to work on Windows, but I'm not positive.

Outside Windows, Archive::Tar is less portable.  For example, in the forty-two
systems of the GCC Compile Farm, five lack Archive::Tar.  (Each of those five
is a CentOS 7 system.  Every system does have tar, gzip and gunzip.)

Either way is fine with me.  Favoring Archive::Tar, a Windows-specific bug is
more likely than a CentOS/RHEL-specific bug.  Favoring shell commands, they
can catch PostgreSQL writing a tar file that the system's tar can't expand.




Re: Rethinking plpgsql's assignment implementation

2020-12-11 Thread Tom Lane
I wrote:
> In any case, that approach still involves inserting some query text
> that the user didn't write, so I'm not sure how much confusion it'd
> remove.  The "SET n:" business at least looks like it's some weird
> prefix comparable to "LINE n:", so while people wouldn't understand
> it I think they'd easily see it as something the system prefixed
> to their query.

> Looking a bit ahead, it's not too hard to imagine plpgsql wishing
> to pass other sorts of annotations through SPI and down to the core
> parser.  Maybe we should think about a more general way to do that
> in an out-of-band, not-visible-in-the-query-text fashion.

I have an idea (no code written yet) about this.

After looking around, it seems like the ParserSetupHook mechanism
is plenty for anything we might want an extension to be able to
change in the behavior of parse analysis.  The hooks that we
currently allow that to set affect only the interpretation of
variable names and $N parameter symbols, but we could surely
add much more in that line as needed.

What we lack is any good way for an extension to control the
behavior of raw_parser() (i.e., gram.y).  Currently, plpgsql
prefixes "SELECT " to expressions it might want to parse, and
now my current patch proposes to prefix something else to get a
different grammar behavior.  Another example of a very similar
problem is typeStringToTypeName(), which prefixes a string it
expects to be a type name with "SELECT NULL::", and then has
to do a bunch of kluges to deal with the underspecification
involved in that.  Based on these examples, we need some sort
of "overall goal" option for the raw parser, but maybe not more
than that --- other things you might want tend to fall into the
parse analysis side of things.

So my idea here is to add a parsing-mode option to raw_parser(),
which would be an enum with values like "normal SQL statement",
"expression only", "type name", "plpgsql assignment statement".
The problem I had with not knowing how many dotted names to
absorb at the start of an assignment statement could be finessed
by inventing "assignment1", "assignment2", and "assignment3"
parsing modes; that's a little bit ugly but not enough to make
me think we need a wider API.
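
To make that concrete, here is a rough sketch of what such an enum and the
adjusted raw_parser() signature might look like (the names are purely
illustrative, nothing of this exists yet):

typedef enum RawParseMode
{
    RAW_PARSE_DEFAULT,          /* normal SQL statement */
    RAW_PARSE_TYPE_NAME,        /* only a type name */
    RAW_PARSE_PLPGSQL_EXPR,     /* only an a_expr */
    RAW_PARSE_PLPGSQL_ASSIGN1,  /* assignment with 1-part target name */
    RAW_PARSE_PLPGSQL_ASSIGN2,  /* assignment with 2-part target name */
    RAW_PARSE_PLPGSQL_ASSIGN3   /* assignment with 3-part target name */
} RawParseMode;

extern List *raw_parser(const char *str, RawParseMode mode);

plpgsql, and typeStringToTypeName(), would then pass the appropriate mode
instead of prefixing "SELECT " or "SELECT NULL::" to the query text.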

As to how it could actually work, I'm noticing that raw_parser
starts out by initializing yyextra's lookahead buffer to empty.
For the parsing modes other than "normal SQL statement", it
could instead inject a lookahead token that is a code that cannot
be generated by the regular lexer.  Then gram.y could have
productions like

EXPRESSION_MODE a_expr { ... generate parse tree ... }

where EXPRESSION_MODE is one of these special tokens.  And now
we have something that will parse an a_expr, and only an a_expr,
and we don't need any special "SELECT " or any other prefix in
the user-visible source string.  Similarly for the other special
parsing modes.

Essentially, this is a way of having a few distinct parsers
that share a common base of productions, without the bloat and
code maintenance issues of building actually-distinct parsers.

A small problem with this is that the images of these special
productions in ECPG would be dead code so far as ECPG is concerned.
For the use-cases I can foresee, there wouldn't be enough special
productions for that to be a deal-breaker.  But we could probably
teach the ECPG grammar-building scripts to filter out these
productions if it ever got to be annoying.

regards, tom lane




Re: please update ps display for recovery checkpoint

2020-12-11 Thread Bossart, Nathan
On 12/11/20, 4:00 PM, "Michael Paquier"  wrote:
> On Fri, Dec 11, 2020 at 06:54:42PM +, Bossart, Nathan wrote:
>> This approach seems reasonable to me.  I've attached my take on it.
>
> +   /* Reset the process title */
> +   set_ps_display("");
> I would still recommend avoiding calls to set_ps_display() if there is
> no need to, so that we avoid useless system calls, so I think that this
> stuff had better use a common path for the set and reset logic.
>
> My counter-proposal is like the attached, with the set/reset part not
> reversed this time, and the code indented :p

Haha.  LGTM.

Nathan



Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-12-11 Thread Michael Paquier
On Fri, Dec 11, 2020 at 05:27:03PM -0300, Alvaro Herrera wrote:
> By the way--  What did you think of the idea of explicitly marking the
> types used for bitmasks using types bits32 and friends, instead of plain
> int, which is harder to spot?

Right, we could just do that while the area is changed.  It is worth
noting that all the REINDEX_REL_* handling could be brushed.  Another
point that has been raised on a recent thread by Peter was that people
preferred a hex style for the declarations rather than bit shifts.
What do you think?
--
Michael




Re: please update ps display for recovery checkpoint

2020-12-11 Thread Michael Paquier
On Fri, Dec 11, 2020 at 06:54:42PM +, Bossart, Nathan wrote:
> This approach seems reasonable to me.  I've attached my take on it.

+   /* Reset the process title */
+   set_ps_display("");
I would still recommend avoiding calls to set_ps_display() if there is
no need to, so that we avoid useless system calls, so I think that this
stuff had better use a common path for the set and reset logic.

My counter-proposal is like the attached, with the set/reset part not
reversed this time, and the code indented :p
--
Michael
From cb7c41b70ff798870f2eddcaa2782bd47b5fd57f Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Sat, 12 Dec 2020 08:51:48 +0900
Subject: [PATCH v9] Add checkpoint/restartpoint status to ps display.

---
 src/backend/access/transam/xlog.c | 45 +++
 1 file changed, 45 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7e81ce4f17..8dd225c2e1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8687,6 +8687,39 @@ UpdateCheckPointDistanceEstimate(uint64 nbytes)
 			(0.90 * CheckPointDistanceEstimate + 0.10 * (double) nbytes);
 }
 
+/*
+ * Update the ps display for a process running a checkpoint.  Note that
+ * this routine should not do any allocations so as it can be called
+ * from a critical section.
+ */
+static void
+update_checkpoint_display(int flags, bool restartpoint, bool reset)
+{
+	/*
+	 * The status is reported only for end-of-recovery and shutdown
+	 * checkpoints or shutdown restartpoints.  Updating the ps display is
+	 * useful in those situations as it may not be possible to rely on
+	 * pg_stat_activity to see the status of the checkpointer or the startup
+	 * process.
+	 */
+	if ((flags & (CHECKPOINT_END_OF_RECOVERY | CHECKPOINT_IS_SHUTDOWN)) == 0)
+		return;
+
+	if (reset)
+		set_ps_display("");
+	else
+	{
+		char		activitymsg[128];
+
+		snprintf(activitymsg, sizeof(activitymsg), "performing %s%s%s",
+ (flags & CHECKPOINT_END_OF_RECOVERY) ? "end-of-recovery " : "",
+ (flags & CHECKPOINT_IS_SHUTDOWN) ? "shutdown " : "",
+ restartpoint ? "restartpoint" : "checkpoint");
+		set_ps_display(activitymsg);
+	}
+}
+
+
 /*
  * Perform a checkpoint --- either during shutdown, or on-the-fly
  *
@@ -8905,6 +8938,9 @@ CreateCheckPoint(int flags)
 	if (log_checkpoints)
 		LogCheckpointStart(flags, false);
 
+	/* Update the process title */
+	update_checkpoint_display(flags, false, false);
+
 	TRACE_POSTGRESQL_CHECKPOINT_START(flags);
 
 	/*
@@ -9120,6 +9156,9 @@ CreateCheckPoint(int flags)
 	/* Real work is done, but log and update stats before releasing lock. */
 	LogCheckpointEnd(false);
 
+	/* Reset the process title */
+	update_checkpoint_display(flags, false, true);
+
 	TRACE_POSTGRESQL_CHECKPOINT_DONE(CheckpointStats.ckpt_bufs_written,
 	 NBuffers,
 	 CheckpointStats.ckpt_segs_added,
@@ -9374,6 +9413,9 @@ CreateRestartPoint(int flags)
 	if (log_checkpoints)
 		LogCheckpointStart(flags, true);
 
+	/* Update the process title */
+	update_checkpoint_display(flags, true, false);
+
 	CheckPointGuts(lastCheckPoint.redo, flags);
 
 	/*
@@ -9492,6 +9534,9 @@ CreateRestartPoint(int flags)
 	/* Real work is done, but log and update before releasing lock. */
 	LogCheckpointEnd(true);
 
+	/* Reset the process title */
+	update_checkpoint_display(flags, true, true);
+
 	xtime = GetLatestXTime();
 	ereport((log_checkpoints ? LOG : DEBUG2),
 			(errmsg("recovery restart point at %X/%X",
-- 
2.29.2





Re: pg_basebackup caused FailedAssertion

2020-12-11 Thread Jeff Davis
Old thread: 
https://www.postgresql.org/message-id/512E427B.9090308%40vmware.com
about commit 3a9e64aa.

On Wed, 2013-02-27 at 19:29 +0200, Heikki Linnakangas wrote:
> Right. I fixed that by adding WL_SOCKET_READABLE, and handling any 
> messages that might arrive after the frontend already sent CopyEnd.
> The 
> frontend shouldn't send any messages after CopyEnd, until it receives
> a 
> CopyEnd from the backend.

It looks like 4bad60e3 may have fixed the problem; is it possible to
just revert 3a9e64aa and allow the case?

Also, the comment added by 3a9e64aa is misleading, because waiting for
a CopyDone from the server is not enough. It's possible that the client
receives the CopyDone from the server and the client sends a new query
before the server breaks from the loop. The client needs to wait until
at least the first CommandComplete.

> In theory, the frontend could already send the next query before 
> receiving the CopyEnd, but libpq doesn't currently allow that. Until 
> someone writes a client that actually tries to do that, I'm not going
> to 
> try to support that in the backend. It would be a lot more work, and 
> likely be broken anyway, without any way to test it.

I tried to add streaming replication support (still a work in progress)
to the rust client[1], and I ran into this problem.

The core of the rust client is fully pipelined and async, so it's a bit
annoying to work around this problem.

Regards,
Jeff Davis

[1] https://github.com/sfackler/rust-postgres/






Re: Insert Documentation - Returning Clause and Order

2020-12-11 Thread David G. Johnston
On Fri, Dec 11, 2020 at 6:24 AM Ashutosh Bapat 
wrote:

> On Thu, Dec 10, 2020 at 7:49 PM David G. Johnston
>  wrote:
>
> > Yeah, the ongoing work on parallel inserts would seem to be an issue.
> > We should probably document that though.  And maybe as part of the parallel
> > inserts patch provide a user-specifiable way to ask for such a guarantee if
> > needed.  'Insert returning ordered'
>
> I am curious about the usecase which needs that guarantee? Don't you
> have a column on which you can ORDER BY so that it returns the same
> order as INSERT?
>

This comes up periodically in the context of auto-generated keys being
returned - specifically on the JDBC project list (maybe elsewhere...).  If
one adds 15 VALUES entries to an insert and then sends them in bulk to the
server it would be helpful if the generated keys could be matched up
one-to-one with the keyless objects in the client.  Basically "pipelining"
the client and server.

David J.


anonymous block returning like a function

2020-12-11 Thread PegoraroF10
I would like to have an anonymous block, like DO, but returning results, like a
usual function does.

I know any user can do ...

create function pg_temp.run_time_bigger(numeric,numeric) returns numeric
language plpgsql as $$ 
begin if $1 > $2 then return $1; else return $2; end if; end;$$;
select * from pg_temp.run_time_bigger(5,3);
drop function pg_temp.run_time_bigger(numeric,numeric);

but it would be better if one could ...
execute block(numeric,numeric) returns numeric language plpgsql as $$ 
begin if $1 > $2 then return $1; else return $2; end if; end;$$ 
USING(5,3); 

That USING clause would supply the params, but if it complicates things it
could easily be replaced by real values, because the block is entirely
created at run time, so it's optional.

What do you think about it?
What part of the postgres code do I have to carefully understand to write
something that does that?







Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-12-11 Thread Alvaro Herrera
By the way--  What did you think of the idea of explicitly marking the
types used for bitmasks using types bits32 and friends, instead of plain
int, which is harder to spot?




Re: PG vs LLVM 12 on seawasp, next round

2020-12-11 Thread Fabien COELHO



Hello Andres,


I hadn't checked that before, but for the last few days there's been a
different failure than the one I saw earlier:

+ERROR:  could not load library 
"/home/fabien/pg/build-farm-11/buildroot/HEAD/pgsql.build/tmp_install/home/fabien/pg/build-farm-11/buildroot/HEAD/inst/lib/postgresql/llvmjit.so":
 libLLVMOrcJIT.so.12git: cannot open shared object file: No such file or directory

whereas what I fixed is about:

+ERROR:  could not load library 
"/home/fabien/pg/build-farm-11/buildroot/HEAD/pgsql.build/tmp_install/home/fabien/pg/build-farm-11/buildroot/HEAD/inst/lib/postgresql/llvmjit.so":
 
/home/fabien/pg/build-farm-11/buildroot/HEAD/pgsql.build/tmp_install/home/fabien/pg/build-farm-11/buildroot/HEAD/inst/lib/postgresql/llvmjit.so:
 undefined symbol: LLVMInitializeX86Target

Changed somewhere between
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=seawasp=2020-11-20%2009%3A17%3A10
and
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=seawasp=2020-11-21%2023%3A17%3A11

The "no such file" error seems more like a machine local issue to me.


I'll look into it when I have time, which may take some time. Hopefully
over the holidays.


--
Fabien.




Re: pg_waldump error message fix

2020-12-11 Thread Bossart, Nathan
On 12/10/20, 9:23 PM, "Michael Paquier"  wrote:
> Please note that this is documented in xlogreader.h.  Hmm.  I can see
> your point here, still I think that what both of you are suggesting
> is not completely correct.  For example, switching to EndRecPtr would
> cause DecodeXLogRecord() to report an error with the end of the
> current record but it makes more sense to point to ReadRecPtr in this
> case.  On the other hand, it would make sense to report the beginning 
> of the position we are working on when validating the header if an
> error happens at this point.  So it seems to me that EndRecPtr or
> ReadRecPtr are not completely correct, and that we may need an extra
> LSN position to tell on which LSN we are working on instead that gets
> updated before the validation part, because we update ReadRecPtr and
> EndRecPtr only after this initial validation is done.

I looked through all the calls to report_invalid_record() in
xlogreader.c and noticed that all but a few in
XLogReaderValidatePageHeader() already report an LSN.  Of the calls in
XLogReaderValidatePageHeader() that don't report an LSN, it looks like
most still report a position, and the remaining ones are for "WAL file
is from different database system...," which IIUC generally happens on
the first page of the segment.

Perhaps we could simply omit the LSN in the pg_waldump message.

Nathan



Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2020-12-11 Thread Fujii Masao




On 2020/12/12 3:01, Bharath Rupireddy wrote:

On Fri, Dec 11, 2020 at 11:01 PM Fujii Masao
 wrote:

On 2020/12/11 19:16, Bharath Rupireddy wrote:

On Thu, Dec 10, 2020 at 7:14 AM Bharath Rupireddy
 wrote:

+   /* We only look for active and open remote connections. */
+   if (entry->invalidated || !entry->conn)
+   continue;

We should return even invalidated entry because it has still cached connection?



I checked this point earlier, for invalidated connections, the tuple
returned from the cache is also invalid and the following error will
be thrown. So, we can not get the server name for that user mapping.
Cache entries too would have been invalidated after the connection is
marked as invalid in pgfdw_inval_callback().

umaptup = SearchSysCache1(USERMAPPINGOID, ObjectIdGetDatum(entry->key));
if (!HeapTupleIsValid(umaptup))
elog(ERROR, "cache lookup failed for user mapping with OID %u",
entry->key);



I further checked on returning invalidated connections in the output
of the function. Actually, the reason I'm seeing a null tuple from sys
cache (and hence the error "cache lookup failed for user mapping with
OID ") for an invalidated connection is that the user mapping
(with OID entry->key that exists in the cache) is getting dropped, so
the sys cache returns null tuple. The use case is as follows:

1) Create a server, role, and user mapping of the role with the server
2) Run a foreign table query, so that the connection related to the
server gets cached
3) Issue DROP OWNED BY for the role created, since the user mapping is
dependent on that role, it gets dropped from the catalogue table and
an invalidation message will be pending to clear the sys cache
associated with that user mapping.
4) Now, if I do select * from postgres_fdw_get_connections() or for
that matter any query, at the beginning the txn
AtStart_Cache()-->AcceptInvalidationMessages()-->pgfdw_inval_callback()
gets called and marks the cached entry as invalidated. Remember the
reason for this invalidation message is that the user mapping with the
OID entry->key is dropped from 3). Later in
postgres_fdw_get_connections(), when we search the sys cache with
entry->key for that invalidated connection, since the user mapping is
dropped from the system, null tuple is returned.


Thanks for the analysis! This means that the cached connection invalidated by 
drop of server or user mapping will not be closed even by the subsequent access 
to the foreign server and will remain until the backend exits. Right?


It will be first marked as invalidated via
AtStart_Cache()-->AcceptInvalidationMessages()-->pgfdw_inval_callback(),
and on the next use of that connection invalidated connections are
disconnected and reconnected.


I was thinking that in the case of drop of user mapping or server, 
hash_search(ConnectionHash) in GetConnection() cannot find the cached connection entry 
invalidated by that drop. Because "user->umid" used as hash key is changed. So 
I was thinking that that invalidated connection will not be closed nor reconnected.




 if (entry->conn != NULL && entry->invalidated && entry->xact_depth == 0)
 {
 elog(DEBUG3, "closing connection %p for option changes to take effect",
  entry->conn);
 disconnect_pg_server(entry);
 }


If so, this seems like a connection-leak bug, at least for me. Thoughts?



It's not a leak. The comment before pgfdw_inval_callback() [1]
explains why we can not immediately close/disconnect the connections
in pgfdw_inval_callback() after marking them as invalidated.


*If* an invalidated connection cannot be closed immediately even in the case of
drop of server or user mapping, we can defer it to the subsequent call to
GetConnection(). That is, GetConnection() closes not only the target
invalidated connection but also all the other invalidated connections. Of
course, invalidated connections will remain until the subsequent
GetConnection() is called, though.




Here is the scenario how in the midst of a txn we get invalidation
messages(AtStart_Cache()-->AcceptInvalidationMessages()-->pgfdw_inval_callback()
happens):

1) select from foreign table with server1, usermapping1 in session1
2) begin a top txn in session1, run a few foreign queries that open up
sub txns internally. meanwhile alter/drop server1/usermapping1 in
session2, then at each start of sub txn also we get to process the
invalidation messages via
AtStart_Cache()-->AcceptInvalidationMessages()-->pgfdw_inval_callback().
So, if we disconnect right after marking invalidated in
pgfdw_inval_callback, that's a problem since we are in a sub txn under
a top txn.


Maybe. But what is the actual problem here?

OTOH, if a cached connection should not be closed in the middle of a transaction,
should postgres_fdw_disconnect() also be disallowed to be executed during a
transaction?

Regards,

--
Fujii Masao
Advanced Computing Technology Center

Re: Rethinking plpgsql's assignment implementation

2020-12-11 Thread Tom Lane
Chapman Flack  writes:
> On 12/11/20 12:21, Tom Lane wrote:
>> solution I adopted was just to invent a new CoercionContext value
>> COERCION_PLPGSQL, representing "use pl/pgsql's rules".  (Basically
>> what that means nowadays is to apply CoerceViaIO if assignment cast
>> lookup doesn't find a cast pathway.)

> That seems like a rule that might be of use in other PLs or extensions;
> could it have a more generic name, COERCION_FALLBACK or something?

I'm not wedded to that name, but I doubt that it's semantics that we
really want to encourage anyone else to use.  In particular, the fact
that it's not a superset of COERCION_EXPLICIT is pretty darn weird,
with little except backwards compatibility to recommend it.

>> is now quoted, but the "SET n:" bit in front of it might confuse people,
>> especially if we don't document this new syntax (which I'm inclined not
>> to, since it's useless in straight SQL).

> If it's true that the only choices for n: are 1: or 2:, maybe it would look
> less confusing in an error message to, hmm, decree that this specialized SET
> form /always/ takes a two-component name, but accept something special like
> ROUTINE.x (or UNNAMED.x or NULL.x or something) for the case where there
> isn't a qualifying label in the plpgsql source?

As the patch stands, it's still using the RECFIELD code paths, which
means that there could be three-component target variable names
(label.variable.field).  If we were to get rid of that and expect
top-level field assignment to also be handled by this new mechanism,
then maybe your idea could be made to work.  But I have not tried to
implement that here, as I don't see how to make it work for RECORD-type
variables (where the names and types of the fields aren't determinate).

In any case, that approach still involves inserting some query text
that the user didn't write, so I'm not sure how much confusion it'd
remove.  The "SET n:" business at least looks like it's some weird
prefix comparable to "LINE n:", so while people wouldn't understand
it I think they'd easily see it as something the system prefixed
to their query.

Looking a bit ahead, it's not too hard to imagine plpgsql wishing
to pass other sorts of annotations through SPI and down to the core
parser.  Maybe we should think about a more general way to do that
in an out-of-band, not-visible-in-the-query-text fashion.

regards, tom lane




Re: Rethinking plpgsql's assignment implementation

2020-12-11 Thread Pavel Stehule
Hi

It is great. I expected much more work.

pá 11. 12. 2020 v 18:21 odesílatel Tom Lane  napsal:

> We've had complaints in the past about how plpgsql can't handle
> assignments to fields in arrays of records [1], that is cases like
>
> arrayvar[n].field := something;
>
> and also complaints about how plpgsql can't handle assignments
> to array slices [2], ie
>
> arrayvar[m:n] := something;
>
> As of commit c7aba7c14, we have another problem, namely that
> plpgsql's subscripted assignment only works for regular arrays;
> it won't work for other types that might define subscript
> assignment handlers.
>
> So I started to think about how to fix that, and eventually
> decided that what we ought to do is nuke plpgsql's array-assignment
> code altogether.  The core code already has support for everything
> we want here in the context of field/element assignments in UPDATE
> commands; if we could get plpgsql to make use of that infrastructure
> instead of rolling its own, we'd be a lot better off.
>
> The hard part of that is that the core parser will only generate
> the structures we need (FieldStores and assignment SubscriptingRefs)
> within UPDATE commands.  We could export the relevant functions
> (particularly transformAssignmentIndirection); but that won't help
> plpgsql very much, because it really wants to be able to run all this
> stuff through SPI.  That means we have to have SQL syntax that can
> generate an expression of that form.
>
> That led me to think about introducing a new statement, say
>

> SET variable_name opt_indirection := a_expr
>

SQL/PSM (ANSI SQL) defines SET var = expr

If you introduce a new statement - LET, then it can be less confusing for
users, and this statement can be the foundation for schema variables. With
this statement the implementation of schema variables is significantly
simpler.

Regards

Pavel



>
> where opt_indirection is gram.y's symbol for "field selections and/or
> subscripts".  The idea here is that a plpgsql statement like
>
> x[2].fld := something;
>
> would be parsed using this new statement, producing an expression
> that uses an assignment SubscriptingRef and a FieldStore operating
> on a Param that gives the initial value of the array-of-composite
> variable "x".  Then plpgsql would just evaluate this expression and
> assign the result to x.  Problem solved.
>
> This almost works as-is, modulo annoying parse conflicts against the
> existing variants of SET.  However there's a nasty little detail
> about what "variable_name" can be in plpgsql: it can be either one or
> two identifiers, since there might be a block label involved, eg
>
> <> declare x int; begin mylabel.x := ...
>
> Between that and the parse-conflict problem, I ended up
> with this syntax:
>
> SET n: variable_name opt_indirection := a_expr
>
> where "n" is an integer literal indicating how many dot-separated names
> should be taken as the base variable name.  Another annoying point is
> that plpgsql historically has allowed fun stuff like
>
> mycount := count(*) from my_table where ...;
>
> that is, after the expression you can have all the rest of an ordinary
> SELECT command.  That's not terribly hard to deal with, but it means
> that this new statement has to have all of SELECT's other options too.
>
> The other area that doesn't quite work without some kind of hack is
> that plpgsql's casting rules for which types can be assigned to what
> are far laxer than what the core parser thinks should be allowed in
> UPDATE.  The cast has to happen within the assignment expression
> for this to work at all, so plpgsql can't fix it by itself.  The
> solution I adopted was just to invent a new CoercionContext value
> COERCION_PLPGSQL, representing "use pl/pgsql's rules".  (Basically
> what that means nowadays is to apply CoerceViaIO if assignment cast
> lookup doesn't find a cast pathway.)
>
> A happy side-effect of this approach is that it actually makes
> some cases faster.  In particular I can measure speedups for
> (a) assignments to subscripted variables and (b) cases where a
> coercion must be performed to produce the result to be assigned.
> I believe the reason for this is that the patch effectively
> merges what had been separate expressions (subscripts or casts,
> respectively) into the main result-producing expression.  This
> eliminates a nontrivial amount of overhead for plancache validity
> checking, execution startup, etc.
>
> Another side-effect is that the report of the statement in error
> cases might look different.  For example, in v13 a typo in a
> subscript expression produces
>
> regression=# do $$ declare x int[]; begin x[!2] = 43; end $$;
> ERROR:  operator does not exist: ! integer
> LINE 1: SELECT !2
>^
> HINT:  No operator matches the given name and argument type. You might
> need to add an explicit type cast.
> QUERY:  SELECT !2
> CONTEXT:  PL/pgSQL function inline_code_block line 1 at 

Re: Proposed patch for key managment

2020-12-11 Thread Bruce Momjian
On Fri, Dec 11, 2020 at 01:01:14PM -0500, Bruce Momjian wrote:
> On Wed, Dec  9, 2020 at 08:40:50PM -0500, Bruce Momjian wrote:
> > My next task is to write a script for Yubikey authentication.
> 
> I know Craig Ringer wanted Yubikey support, which allows two-factor
> authentication, so I have added it to the most recent patch by adding a
> cluster_passphrase_command %d/directory parameter:
> 
>   https://github.com/postgres/postgres/compare/master...bmomjian:key.diff
> 
> You can also store the PIN in a file, so you don't need a PIN to be
> entered by the user for each server start.

Here is the output without requiring a PIN;  attached is the script used:

++ initdb -K 256 -R -c '/u/postgres/tmp/pass_yubi_nopin.sh "%d"'
The files belonging to this database system will be owned by user 
"postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.
Cluster file encryption is enabled.

fixing permissions on existing directory /u/pgsql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ... engine "pkcs11" set.

WARNING:  The Yubikey can be locked and require a reset if too many pin
attempts fail.  It is recommended to run this command manually and save
the passphrase in a secure location for possible recovery.
engine "pkcs11" set.
ok
performing post-bootstrap initialization ... engine "pkcs11" set.
ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

pg_ctl -D /u/pgsql/data -l logfile start

++ pg_ctl -R -l /u/pg/server.log start
waiting for server to start... done
server started
++ pg_altercpass -R '/u/postgres/tmp/pass_yubi_nopin.sh "%d"' 
'/u/postgres/tmp/pass_yubi_nopin.sh "%d"'
engine "pkcs11" set.
engine "pkcs11" set.

WARNING:  The Yubikey can be locked and require a reset if too many pin
attempts fail.  It is recommended to run this command manually and save
the passphrase in a secure location for possible recovery.
engine "pkcs11" set.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



pass_yubi_nopin.sh
Description: Bourne shell script


Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-12-11 Thread Peter Eisentraut

On 2020-12-05 02:30, Michael Paquier wrote:

On Fri, Dec 04, 2020 at 04:28:26PM -0300, Alvaro Herrera wrote:

FWIW I'm with Peter on this.


Okay, attached is a patch to adjust the enums for the set of utility
commands that is the set of things I have touched lately.  Should that
be extended more?  I have not done that as a lot of those structures
exist as such for a long time.


I think this patch is good.

I have in the meantime committed a similar patch for cleaning up this 
issue in pg_dump.





Re: Rethinking plpgsql's assignment implementation

2020-12-11 Thread Chapman Flack
On 12/11/20 12:21, Tom Lane wrote:
> solution I adopted was just to invent a new CoercionContext value
> COERCION_PLPGSQL, representing "use pl/pgsql's rules".  (Basically
> what that means nowadays is to apply CoerceViaIO if assignment cast
> lookup doesn't find a cast pathway.)

That seems like a rule that might be of use in other PLs or extensions;
could it have a more generic name, COERCION_FALLBACK or something?

> is now quoted, but the "SET n:" bit in front of it might confuse people,
> especially if we don't document this new syntax (which I'm inclined not
> to, since it's useless in straight SQL).

If it's true that the only choices for n: are 1: or 2:, maybe it would look
less confusing in an error message to, hmm, decree that this specialized SET
form /always/ takes a two-component name, but accept something special like
ROUTINE.x (or UNNAMED.x or NULL.x or something) for the case where there
isn't a qualifying label in the plpgsql source?

It's still a strange arbitrary creation, but might give more of a hint of
its meaning if it crops up in an error message somewhere.

Regards,
-Chap




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2020-12-11 Thread Bharath Rupireddy
On Fri, Dec 11, 2020 at 3:46 PM Bharath Rupireddy
 wrote:
> If we were to show invalidated connections in the output of
> postgres_fdw_get_connections(), we can ignore the entry and continue
> further if the user mapping sys cache search returns null tuple:
>
> umaptup = SearchSysCache1(USERMAPPINGOID, ObjectIdGetDatum(entry->key));
>
> if (!HeapTupleIsValid(umaptup))
>continue;

Any thoughts here?

> > > Also this makes me wonder if we should return both the server name and 
> > > boolean flag indicating whether it's invalidated or not. If so, users can 
> > > easily find the invalidated connection entry and disconnect it because 
> > > there is no need to keep invalidated connection.
> > >
> >
> > Currently we are returning a list of foreign server names with which
> > there exist active connections. If we somehow address the above
> > mentioned problem for invalid connections and choose to show them as
> > well, then how should our output look like? Is it something like we
> > prepare a list of pairs (servername, validflag)?
>
> If agreed on above point, we can output something like: (myserver1,
> valid), (myserver2, valid), (myserver3, invalid), (myserver4, valid)

And here on the output text?

In case we agree on the above output format, one funny thing that could
occur is that if some hypothetical person has "valid" or "invalid" as
their foreign server names, they will have difficulty reading their
output: (valid, valid), (valid, invalid), (invalid, valid), (invalid,
invalid).

Or should it be something like pairs of (server_name, true/false)?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: pg_basebackup test coverage

2020-12-11 Thread Tom Lane
Robert Haas  writes:
> Well, either this or Noah's method has the disadvantage that not
> everyone will get the benefit of the tests, and that those who wish to
> get that benefit must install more stuff. But, all three of the
> computers I have within arm's reach (yeah, I might have a problem)
> have Archive::Tar installed, so maybe it's not a big concern in
> practice.

FWIW, it looks to me like Archive::Tar is part of the standard Perl
installation on both RHEL and macOS, so it's probably pretty common.

> I am slightly inclined to think that the perl package
> approach might be better than shell commands because perhaps it is
> more likely to work on Windows, but I'm not positive.

Yeah, that makes sense to me too.

regards, tom lane




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2020-12-11 Thread Bharath Rupireddy
On Fri, Dec 11, 2020 at 11:01 PM Fujii Masao
 wrote:
> On 2020/12/11 19:16, Bharath Rupireddy wrote:
> > On Thu, Dec 10, 2020 at 7:14 AM Bharath Rupireddy
> >  wrote:
> >>> +   /* We only look for active and open remote connections. */
> >>> +   if (entry->invalidated || !entry->conn)
> >>> +   continue;
> >>>
> >>> We should return even invalidated entry because it has still cached 
> >>> connection?
> >>>
> >>
> >> I checked this point earlier, for invalidated connections, the tuple
> >> returned from the cache is also invalid and the following error will
> >> be thrown. So, we can not get the server name for that user mapping.
> >> Cache entries too would have been invalidated after the connection is
> >> marked as invalid in pgfdw_inval_callback().
> >>
> >> umaptup = SearchSysCache1(USERMAPPINGOID, ObjectIdGetDatum(entry->key));
> >> if (!HeapTupleIsValid(umaptup))
> >>elog(ERROR, "cache lookup failed for user mapping with OID %u",
> >> entry->key);
> >>
> >
> > I further checked on returning invalidated connections in the output
> > of the function. Actually, the reason I'm seeing a null tuple from sys
> > cache (and hence the error "cache lookup failed for user mapping with
> > OID ") for an invalidated connection is that the user mapping
> > (with OID entry->key that exists in the cache) is getting dropped, so
> > the sys cache returns null tuple. The use case is as follows:
> >
> > 1) Create a server, role, and user mapping of the role with the server
> > 2) Run a foreign table query, so that the connection related to the
> > server gets cached
> > 3) Issue DROP OWNED BY for the role created, since the user mapping is
> > dependent on that role, it gets dropped from the catalogue table and
> > an invalidation message will be pending to clear the sys cache
> > associated with that user mapping.
> > 4) Now, if I do select * from postgres_fdw_get_connections() or for
> > that matter any query, at the beginning the txn
> > AtStart_Cache()-->AcceptInvalidationMessages()-->pgfdw_inval_callback()
> > gets called and marks the cached entry as invalidated. Remember the
> > reason for this invalidation message is that the user mapping with the
> > OID entry->key is dropped from 3). Later in
> > postgres_fdw_get_connections(), when we search the sys cache with
> > entry->key for that invalidated connection, since the user mapping is
> > dropped from the system, null tuple is returned.
>
> Thanks for the analysis! This means that the cached connection invalidated by 
> drop of server or user mapping will not be closed even by the subsequent 
> access to the foreign server and will remain until the backend exits. Right?

It will be first marked as invalidated via
AtStart_Cache()-->AcceptInvalidationMessages()-->pgfdw_inval_callback(),
and on the next use of that connection invalidated connections are
disconnected and reconnected.

if (entry->conn != NULL && entry->invalidated && entry->xact_depth == 0)
{
elog(DEBUG3, "closing connection %p for option changes to take effect",
 entry->conn);
disconnect_pg_server(entry);
}

> If so, this seems like a connection-leak bug, at least for me. Thoughts?
>

It's not a leak. The comment before pgfdw_inval_callback() [1]
explains why we can not immediately close/disconnect the connections
in pgfdw_inval_callback() after marking them as invalidated.

Here is the scenario how in the midst of a txn we get invalidation
messages(AtStart_Cache()-->AcceptInvalidationMessages()-->pgfdw_inval_callback()
happens):

1) select from foreign table with server1, usermapping1 in session1
2) begin a top txn in session1, run a few foreign queries that open up
sub txns internally. meanwhile alter/drop server1/usermapping1 in
session2, then at each start of sub txn also we get to process the
invalidation messages via
AtStart_Cache()-->AcceptInvalidationMessages()-->pgfdw_inval_callback().
So, if we disconnect right after marking invalidated in
pgfdw_inval_callback, that's a problem since we are in a sub txn under
a top txn.

I don't think we can do anything here to disconnect the connections
right after the invalidation happens. Thoughts?


[1]
/*
 * Connection invalidation callback function
 *
 * After a change to a pg_foreign_server or pg_user_mapping catalog entry,
 * mark connections depending on that entry as needing to be remade.
 * We can't immediately destroy them, since they might be in the midst of
 * a transaction, but we'll remake them at the next opportunity.
 *
 * Although most cache invalidation callbacks blow away all the related stuff
 * regardless of the given hashvalue, connections are expensive enough that
 * it's worth trying to avoid that.
 *
 * NB: We could avoid unnecessary disconnection more strictly by examining
 * individual option values, but it seems too much effort for the gain.
 */
static void
pgfdw_inval_callback(Datum arg, int 

Re: Proposed patch for key managment

2020-12-11 Thread Bruce Momjian
On Wed, Dec  9, 2020 at 08:40:50PM -0500, Bruce Momjian wrote:
> My next task is to write a script for Yubikey authentication.

I know Craig Ringer wanted Yubikey support, which allows two-factor
authentication, so I have added it to the most recent patch by adding a
cluster_passphrase_command %d/directory parameter:

https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

You can also store the PIN in a file, so you don't need a PIN to be
entered by the user for each server start.  Attached is the script I
with a PIN required.  Here is a session:

$ initdb -K 256 -R -c '/u/postgres/tmp/pass_yubi.sh %R "%d"'
The files belonging to this database system will be owned by user 
"postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.
Cluster file encryption is enabled.

fixing permissions on existing directory /u/pgsql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ...
Enter Yubikey PIN:

WARNING:  The Yubikey can be locked and require a reset if too many pin
attempts fail.  It is recommended to run this command manually and save
the passphrase in a secure location for possible recovery.

--> Enter Yubikey PIN:
ok
performing post-bootstrap initialization ...
--> Enter Yubikey PIN:
ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

pg_ctl -D /u/pgsql/data -l logfile start


$ pg_ctl -R -l /u/pg/server.log start
waiting for server to start...
--> Enter Yubikey PIN:
 done
server started

It even allows for passphrase rotation using my pg_altercpass tool with
this patch:


https://github.com/bmomjian/postgres/compare/key...bmomjian:key-alter.diff

The Yubikey-encrypted passphrase is stored in the key directory, so the
encrypted passphrase stays with the data/WAL keys during passphrase
rotation:

$ pg_altercpass -R '/u/postgres/tmp/pass_yubi.sh %R "%d"' 
'/u/postgres/tmp/pass_yubi.sh %R "%d"'

--> Enter Yubikey PIN:

--> Enter Yubikey PIN:

WARNING:  The Yubikey can be locked and require a reset if too many pin
attempts fail.  It is recommended to run this command manually and save
the passphrase in a secure location for possible recovery.

--> Enter Yubikey PIN:

Yubikey PIN rotation has to be done using the Yubikey tool, and data/WAL
key rotation has to be done via switching to a standby, which hasn't
been implemented yet.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



pass_yubi.sh
Description: Bourne shell script


Re: MultiXact\SLRU buffers configuration

2020-12-11 Thread Gilles Darold

Le 10/12/2020 à 15:45, Gilles Darold a écrit :

Le 08/12/2020 à 18:52, Andrey Borodin a écrit :

Hi Gilles!

Many thanks for your message!


On 8 Dec 2020, at 21:05, Gilles Darold wrote:

I know that this report is not really helpful

Quite the contrary - these benchmarks prove that a controllable reproduction exists.
I've rebased the patches for PG11. Can you please benchmark them (without extending
SLRU)?

Best regards, Andrey Borodin.


Hi,


Running tests yesterday with the patches reported a lot of failures
with errors on INSERT and UPDATE statements:



ERROR:  lock MultiXactOffsetControlLock is not held


After a patch review this morning I think I have found what's going
wrong. In patch
v6-0001-Use-shared-lock-in-GetMultiXactIdMembers-for-offs.patch I
think there is a missing reinitialisation of the lockmode variable to
LW_NONE inside the retry loop, after the call to LWLockRelease() in
src/backend/access/transam/multixact.c:1392:GetMultiXactIdMembers().
I've attached a new version of the patch for master that includes the
fix; I'm using it with PG11 now and everything works very well.
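
To illustrate the kind of fix meant here (a simplified sketch only, not the
actual GetMultiXactIdMembers() code; LW_NONE is the patch's "lock not held"
sentinel and retry_needed stands in for the patch's retry condition):

    lockmode = LW_NONE;
    for (;;)
    {
        if (lockmode == LW_NONE)
        {
            LWLockAcquire(MultiXactOffsetControlLock, LW_SHARED);
            lockmode = LW_SHARED;
        }

        /* ... read the offsets/members pages under the lock ... */

        if (retry_needed)
        {
            LWLockRelease(MultiXactOffsetControlLock);
            lockmode = LW_NONE;     /* the missing reinitialisation */
            continue;
        }
        break;
    }

Without resetting lockmode after the release, the next iteration wrongly
assumes the lock is still held, which matches the "lock
MultiXactOffsetControlLock is not held" errors above.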



I'm running more tests to see the performance impact of playing
with multixact_offsets_slru_buffers, multixact_members_slru_buffers
and multixact_local_cache_entries. I will report the results later today.




Hi,

Sorry for the delay, I have done some further tests to try to reach the
limit without bottlenecks on multixact or shared buffers. The tests were
done on a Microsoft Azure machine with 2TB of RAM and 4 sockets of Intel
Xeon Platinum 8280M (128 cpus). PG configuration:


    max_connections = 4096
    shared_buffers = 64GB
    max_prepared_transactions = 2048
    work_mem = 256MB
    maintenance_work_mem = 2GB
    wal_level = minimal
    synchronous_commit = off
    commit_delay = 1000
    commit_siblings = 10
    checkpoint_timeout = 1h
    max_wal_size = 32GB
    checkpoint_completion_target = 0.9

I have tested several values for the different buffer variables,
starting from:


    multixact_offsets_slru_buffers = 64
    multixact_members_slru_buffers = 128
    multixact_local_cache_entries = 256

up to the values giving the best performance we achieved in this test while
avoiding MultiXactOffsetControlLock or MultiXactMemberControlLock waits:


    multixact_offsets_slru_buffers = 128
    multixact_members_slru_buffers = 512
    multixact_local_cache_entries = 1024

Also, shared_buffers has been increased up to 256GB to avoid
buffer_mapping contention.


Our last best test reports the following wait events:

     event_type |         event          |    sum
    ------------+------------------------+-----------
     Client     | ClientRead             | 321690211
     LWLock     | buffer_content         |   2970016
     IPC        | ProcArrayGroupUpdate   |   2317388
     LWLock     | ProcArrayLock          |   1445828
     LWLock     | WALWriteLock           |   1187606
     LWLock     | SubtransControlLock    |    972889
     Lock       | transactionid          |    840560
     Lock       | relation               |    587600
     Activity   | LogicalLauncherMain    |    529599
     Activity   | AutoVacuumMain         |    528097

At this stage I don't think we can get better performance by tuning
these buffers, at least with PG11.


Regarding the performance gain from the shared-lock patch for
GetMultiXactIdMembers, unfortunately I cannot see a difference with or
without this patch; it could be related to our particular benchmark. But
clearly the patch on multixact buffers should be committed, as it is
really helpful to be able to tune PG when multixact bottlenecks are found.



Best regards,

--
Gilles Darold
LzLabs GmbH
https://www.lzlabs.com/




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2020-12-11 Thread Fujii Masao




On 2020/12/10 10:44, Bharath Rupireddy wrote:

On Wed, Dec 9, 2020 at 4:49 PM Fujii Masao  wrote:

I started reviewing 0001 patch.



Thanks!


IMO the 0001 patch should be self-contained so that we can commit it at first. 
That is, I think that it's better to move the documents and tests for the 
functions 0001 patch adds from 0004 to 0001.



+1. I will make each patch self-contained in the next version which I
plan to submit soon.


Since 0001 introduces new user-visible functions into postgres_fdw, the version 
of postgres_fdw should be increased?



Yeah, looks like we should do that; dblink did that when it
introduced new functions. In case the new functions are not required
by anyone, they can choose to go back to 1.0.

Should we make the new version 1.1 or 2.0? I prefer to make it 1.1,
as we are just adding a little functionality over 1.0. I will change the
default_version from 1.0 to 1.1 and add a new
postgres_fdw--1.1.sql file.


+1




If okay, I will make changes to 0001 patch.


Similar code to get the server name from a cached connection entry also exists in
pgfdw_reject_incomplete_xact_state_change(). I'm tempted to make a "common"
function for that code and use it both in postgres_fdw_get_connections() and
pgfdw_reject_incomplete_xact_state_change(), to simplify the code.



+1. I will move the server name finding code to a new function, say
char *pgfdw_get_server_name(ConnCacheEntry *entry);


+   /* We only look for active and open remote connections. */
+   if (entry->invalidated || !entry->conn)
+   continue;

We should return even invalidated entry because it has still cached connection?



I checked this point earlier, for invalidated connections, the tuple
returned from the cache is also invalid and the following error will
be thrown. So, we can not get the server name for that user mapping.
Cache entries too would have been invalidated after the connection is
marked as invalid in pgfdw_inval_callback().

umaptup = SearchSysCache1(USERMAPPINGOID, ObjectIdGetDatum(entry->key));
if (!HeapTupleIsValid(umaptup))
   elog(ERROR, "cache lookup failed for user mapping with OID %u",
entry->key);

Can we reload the sys cache entries of USERMAPPINGOID (if there is a
way) for invalid connections in our new function and then do a look
up? If not, another way could be storing the associated server name or
oid in the ConnCacheEntry. Currently we store user mapping oid(in
key), its hash value(in mapping_hashvalue) and foreign server oid's
hash value (in server_hashvalue). If we have the foreign server oid,
then we can just look up for the server name, but I'm not quite sure
whether we get the same issue i.e. invalid tuples when the entry gets
invalidated (via pgfdw_inval_callback) due to some change in foreign
server options.

IMHO, we can simply choose to show all the active, valid connections. Thoughts?


Also this makes me wonder if we should return both the server name and boolean 
flag indicating whether it's invalidated or not. If so, users can easily find 
the invalidated connection entry and disconnect it because there is no need to 
keep invalidated connection.



Currently we are returning a list of foreign server names with which
there exist active connections. If we somehow address the above
mentioned problem for invalid connections and choose to show them as
well, then how should our output look like? Is it something like we
prepare a list of pairs (servername, validflag)?


+   if (all)
+   {
+   hash_destroy(ConnectionHash);
+   ConnectionHash = NULL;
+   result = true;
+   }

Could you tell me why ConnectionHash needs to be destroyed?



Say a session has made hundreds of different foreign server
connections, and users want to disconnect all of them with the
new function and don't want any further foreign connections in that
session. But then why keep the cache lying around and holding that
many entries? Instead we can destroy the cache; if necessary it will
be allocated again on the next foreign server connection.

IMHO, it is better to destroy the cache in the disconnect-all case,
hoping to save memory, on the assumption that (next time, if required)
the cache allocation doesn't take much time. Thoughts?


Ok, but why is ConnectionHash destroyed only when "all" is true? Even when 
"all" is false, for example, the following query can disconnect all the cached 
connections. Even in this case, i.e., whenever there are no cached connections, ConnectionHash 
should be destroyed?

SELECT postgres_fdw_disconnect(srvname) FROM pg_foreign_server ;
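Just to illustrate the question, an untested sketch of that behaviour could
scan the hash after disconnecting and destroy it only when no entry still
holds an open connection:

    bool        found_open = false;

    hash_seq_init(&scan, ConnectionHash);
    while ((entry = (ConnCacheEntry *) hash_seq_search(&scan)) != NULL)
    {
        if (entry->conn != NULL)
        {
            found_open = true;
            hash_seq_term(&scan);
            break;
        }
    }

    if (!found_open)
    {
        hash_destroy(ConnectionHash);
        ConnectionHash = NULL;
    }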

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2020-12-11 Thread Fujii Masao




On 2020/12/11 19:16, Bharath Rupireddy wrote:

On Thu, Dec 10, 2020 at 7:14 AM Bharath Rupireddy
 wrote:

+   /* We only look for active and open remote connections. */
+   if (entry->invalidated || !entry->conn)
+   continue;

We should return even invalidated entry because it has still cached connection?



I checked this point earlier, for invalidated connections, the tuple
returned from the cache is also invalid and the following error will
be thrown. So, we can not get the server name for that user mapping.
Cache entries too would have been invalidated after the connection is
marked as invalid in pgfdw_inval_callback().

umaptup = SearchSysCache1(USERMAPPINGOID, ObjectIdGetDatum(entry->key));
if (!HeapTupleIsValid(umaptup))
   elog(ERROR, "cache lookup failed for user mapping with OID %u",
entry->key);



I further checked on returning invalidated connections in the output
of the function. Actually, the reason I'm seeing a null tuple from sys
cache (and hence the error "cache lookup failed for user mapping with
OID ") for an invalidated connection is that the user mapping
(with OID entry->key that exists in the cache) is getting dropped, so
the sys cache returns null tuple. The use case is as follows:

1) Create a server, role, and user mapping of the role with the server
2) Run a foreign table query, so that the connection related to the
server gets cached
3) Issue DROP OWNED BY for the role created, since the user mapping is
dependent on that role, it gets dropped from the catalogue table and
an invalidation message will be pending to clear the sys cache
associated with that user mapping.
4) Now, if I do select * from postgres_fdw_get_connections() or for
that matter any query, at the beginning the txn
AtStart_Cache()-->AcceptInvalidationMessages()-->pgfdw_inval_callback()
gets called and marks the cached entry as invalidated. Remember the
reason for this invalidation message is that the user mapping with the
OID entry->key is dropped from 3). Later in
postgres_fdw_get_connections(), when we search the sys cache with
entry->key for that invalidated connection, since the user mapping is
dropped from the system, null tuple is returned.


Thanks for the analysis! This means that the cached connection invalidated by 
drop of server or user mapping will not be closed even by the subsequent access 
to the foreign server and will remain until the backend exits. Right? If so, 
this seems like a connection-leak bug, at least to me. Thoughts?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: pg_basebackup test coverage

2020-12-11 Thread Robert Haas
On Fri, Dec 11, 2020 at 3:04 AM Michael Paquier  wrote:
> Why don't you just use Archive::Tar [1] instead of looking for some system
> commands?  Combining list_files() with extract(), it is possible to
> extract an archive, with or without compression, without hoping for an
> equivalent to exist on Windows.  That would not be the first time
> either that there is a TAP test that skips some tests if a module does
> not exist.  See for example what psql does with IO::Pty.

Well, either this or Noah's method has the disadvantage that not
everyone will get the benefit of the tests, and that those who wish to
get that benefit must install more stuff. But, all three of the
computers I have within arm's reach (yeah, I might have a problem)
have Archive::Tar installed, so maybe it's not a big concern in
practice. I am slightly inclined to think that the perl package
approach might be better than shell commands because perhaps it is
more likely to work on Windows, but I'm not positive.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Clean up ancient test style

2020-12-11 Thread Tom Lane
Peter Eisentraut  writes:
> Many older tests were written in a style like
>  SELECT '' AS two, i.* FROM INT2_TBL ...

> where the first column indicated the number of expected result rows.
> This has gotten increasingly out of date, as the test data fixtures
> have expanded, so a lot of these don't match anymore and are misleading. 
>   Moreover, this style isn't really necessary, since the psql output 
> already shows the number of result rows.  (Perhaps this was different at 
> some point?)

> I'm proposing to clean all this up by removing all those extra columns.

+1 for concept ... I didn't bother to check the patch in detail.

regards, tom lane




Rethinking plpgsql's assignment implementation

2020-12-11 Thread Tom Lane
We've had complaints in the past about how plpgsql can't handle
assignments to fields in arrays of records [1], that is cases like

arrayvar[n].field := something;

and also complaints about how plpgsql can't handle assignments
to array slices [2], ie

arrayvar[m:n] := something;

As of commit c7aba7c14, we have another problem, namely that
plpgsql's subscripted assignment only works for regular arrays;
it won't work for other types that might define subscript
assignment handlers.

So I started to think about how to fix that, and eventually
decided that what we ought to do is nuke plpgsql's array-assignment
code altogether.  The core code already has support for everything
we want here in the context of field/element assignments in UPDATE
commands; if we could get plpgsql to make use of that infrastructure
instead of rolling its own, we'd be a lot better off.

The hard part of that is that the core parser will only generate
the structures we need (FieldStores and assignment SubscriptingRefs)
within UPDATE commands.  We could export the relevant functions
(particularly transformAssignmentIndirection); but that won't help
plpgsql very much, because it really wants to be able to run all this
stuff through SPI.  That means we have to have SQL syntax that can
generate an expression of that form.

That led me to think about introducing a new statement, say

SET variable_name opt_indirection := a_expr

where opt_indirection is gram.y's symbol for "field selections and/or
subscripts".  The idea here is that a plpgsql statement like

x[2].fld := something;

would be parsed using this new statement, producing an expression
that uses an assignment SubscriptingRef and a FieldStore operating
on a Param that gives the initial value of the array-of-composite
variable "x".  Then plpgsql would just evaluate this expression and
assign the result to x.  Problem solved.

This almost works as-is, modulo annoying parse conflicts against the
existing variants of SET.  However there's a nasty little detail
about what "variable_name" can be in plpgsql: it can be either one or
two identifiers, since there might be a block label involved, eg

<> declare x int; begin mylabel.x := ...

Between that and the parse-conflict problem, I ended up
with this syntax:

SET n: variable_name opt_indirection := a_expr

where "n" is an integer literal indicating how many dot-separated names
should be taken as the base variable name.  Another annoying point is
that plpgsql historically has allowed fun stuff like

mycount := count(*) from my_table where ...;

that is, after the expression you can have all the rest of an ordinary
SELECT command.  That's not terribly hard to deal with, but it means
that this new statement has to have all of SELECT's other options too.

The other area that doesn't quite work without some kind of hack is
that plpgsql's casting rules for which types can be assigned to what
are far laxer than what the core parser thinks should be allowed in
UPDATE.  The cast has to happen within the assignment expression
for this to work at all, so plpgsql can't fix it by itself.  The
solution I adopted was just to invent a new CoercionContext value
COERCION_PLPGSQL, representing "use pl/pgsql's rules".  (Basically
what that means nowadays is to apply CoerceViaIO if assignment cast
lookup doesn't find a cast pathway.)

A happy side-effect of this approach is that it actually makes
some cases faster.  In particular I can measure speedups for
(a) assignments to subscripted variables and (b) cases where a
coercion must be performed to produce the result to be assigned.
I believe the reason for this is that the patch effectively
merges what had been separate expressions (subscripts or casts,
respectively) into the main result-producing expression.  This
eliminates a nontrivial amount of overhead for plancache validity
checking, execution startup, etc.

Another side-effect is that the report of the statement in error
cases might look different.  For example, in v13 a typo in a
subscript expression produces

regression=# do $$ declare x int[]; begin x[!2] = 43; end $$;
ERROR:  operator does not exist: ! integer
LINE 1: SELECT !2
   ^
HINT:  No operator matches the given name and argument type. You might need to 
add an explicit type cast.
QUERY:  SELECT !2
CONTEXT:  PL/pgSQL function inline_code_block line 1 at assignment

With this patch, you get

regression=# do $$ declare x int[]; begin x[!2] = 43; end $$;
ERROR:  operator does not exist: ! integer
LINE 1: SET 1: x[!2] = 43
 ^
HINT:  No operator matches the given name and argument type. You might need to 
add an explicit type cast.
QUERY:  SET 1: x[!2] = 43
CONTEXT:  PL/pgSQL function inline_code_block line 1 at assignment

It seems like a clear improvement to me that the whole plpgsql statement
is now quoted, but the "SET n:" bit in front of it might confuse people,
especially if we don't document this new 

Clean up ancient test style

2020-12-11 Thread Peter Eisentraut

Many older tests were written in a style like

SELECT '' AS two, i.* FROM INT2_TBL ...

where the first column indicated the number of expected result rows.
This has gotten increasingly out of date, as the test data fixtures
have expanded, so a lot of these don't match anymore and are misleading. 
 Moreover, this style isn't really necessary, since the psql output 
already shows the number of result rows.  (Perhaps this was different at 
some point?)


I'm proposing to clean all this up by removing all those extra columns.

The patch is very big, so I'm attaching a compressed version.  You can 
also see a diff online: 
https://github.com/postgres/postgres/compare/master...petere:test-cleanup


0001-Clean-up-ancient-test-style.patch.gz
Description: GNU Zip compressed data


Re: On login trigger: take three

2020-12-11 Thread Pavel Stehule
On Fri, Dec 11, 2020 at 17:05, Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

>
>
> On 11.12.2020 18:40, Pavel Stehule wrote:
>
>
> is not correct. It makes it not possible to superuser to disable triggers
>> for all users.
>>
>
> pg_database_ownercheck returns true for superuser always.
>
>
> Sorry, but I consider different case: when normal user is connected to the
> database.
> In this case pg_database_ownercheck returns false and trigger is not
> disabled, isn't it?
>

My idea was to reduce the necessary rights to database owners.  But you are
right: only a superuser can create an event trigger, so this feature cannot be
used in DBaaS environments, and so my original idea was wrong.


>
> Also GUCs are not associated with any database. So I do not understand
>> why  this check of database ownership is relevant at all?
>>
>> What kind of protection violation we want to prevent?
>>
>> It seems to be obvious that normal user should not be able to prevent
>> trigger execution because this triggers may be used to enforce some
>> security policies.
>> If trigger was created by user itself, then it can drop or disable it
>> using ALTER statement. GUC is not needed for it.
>>
>
> when you cannot connect to the database, then you cannot do ALTER. In
> DBaaS environments, a lot of users do not have superuser rights.
>
>
>
> But only superusers can set login triggers, right?
> So only a superuser can make a mistake in this trigger. But he has enough
> rights to recover this error. Normal users are not able to define on
> connection triggers and
> should not have rights to disable them.
>

yes, it is true

Pavel


> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>


Re: Feature Proposal: Add ssltermination parameter for SNI-based LoadBalancing

2020-12-11 Thread Lukas Meisegeier

Hey Heikki,

thanks for providing feedback :)
The traffic between proxy and psql-server is unencrypted, that's why I
don't need to patch the server.
I tried returning a fixed response to the first plain SSLRequest,
forwarding it to a psql-server with ssl enabled, and tried to switch then
to ssl at connection startup, but that didn't work out. I guess it's
because the psql-server won't accept an ssl connection if it's not
requested via SSLRequest.
I would definitely appreciate it if the psql-server could accept the
TLS client hello directly, but we would still need to set the
tls-sni-extension correctly.
Perhaps we could rename the parameter to "sslplainrequest(yes/no)" and
start with making the plain SSLRequest optional in the psql-server.

Best Regards
Lukas


Am 11-Dec-20 um 14:26 schrieb Heikki Linnakangas:

On 10/12/2020 17:49, Lukas Meisegeier wrote:

I try to host multiple postgresql-servers on the same ip and the same
port through SNI-based load-balancing.
Currently this is not possible because of two issues:
1. The psql client won't set the tls-sni-extension correctly
(https://www.postgresql.org/message-id/20181211145240.GL20222%40redhat.com)

2. The psql connection protocol implements a SSLRequest in plain text
before actually opening a connection.

The first issue is easily solvable by calling
`SSL_set_tlsext_host_name(conn->ssl,
conn->connhost[conn->whichhost].host)` before opening the connection.

The second issue is also solvable through a new parameter
"ssltermination" which if set to "proxy" will skip the initial
SSLRequest and connects directly through ssl.
The default value would be "server" which changes nothing on the
existing behaviour.


Don't you need backend changes as well? The backend will still expect
the client to send an SSLRequest. Or is the connection from the proxy to
the actual server unencrypted?

It's not very nice that the client needs to set special options,
depending on whether the server is behind a proxy or not. Could you
teach the proxy to deal with the SSLRequest message?

Perhaps we should teach the backend to accept a TLS ClientHello
directly, without the SSLRequest message. That way, the client could
send the ClientHello without SSLRequest, and the proxy wouldn't need to
care about SSLRequest. It would also eliminate one round-trip from the
protocol handshake, which would be nice. A long deprecation/transition
period would be needed before we could make that the default behavior,
but that's ok.

- Heikki





Re: Feature Proposal: Add ssltermination parameter for SNI-based LoadBalancing

2020-12-11 Thread Heikki Linnakangas

On 11/12/2020 16:46, Lukas Meisegeier wrote:

Hey Heikki,

thanks for providing feedback :)
The traffic between proxy and psql-server is unencrypted, that's why I
don't need to patch the server.


Ok.


I tried returning a fixed response to the first plain SSLRequest,
forwarding it to a psql-server with ssl enabled, and tried to switch then
to ssl at connection startup, but that didn't work out. I guess it's
because the psql-server won't accept an ssl connection if it's not
requested via SSLRequest.


Your proxy could receive the client's SSLRequest message, and respond 
with a single byte 'S'. You don't need to forward that to the real 
PostgreSQL server, since the connection to the PostgreSQL server is 
unencrypted. Then perform the TLS handshake, and forward all traffic to 
the real server only after that.


Client: -> SSLRequest
 Proxy: <- 'S'
Client: -> TLS ClientHello
 Proxy: [finish TLS handshake]
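To spell out that first exchange, a minimal proxy-side sketch (my own
illustration, untested; plain POSIX sockets, the TLS handshake itself and
short-read handling are omitted) might be:

#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define PG_SSLREQUEST_CODE 80877103     /* (1234 << 16) | 5679 */

/*
 * Read the client's 8-byte SSLRequest and answer it with a single 'S'.
 * After this returns 0, the client sends a TLS ClientHello and the proxy
 * performs the TLS handshake itself; traffic to the real server stays plain.
 */
static int
answer_sslrequest(int client_fd)
{
    char        buf[8];
    uint32_t    len;
    uint32_t    code;

    if (read(client_fd, buf, sizeof(buf)) != (ssize_t) sizeof(buf))
        return -1;              /* a real proxy would loop on short reads */

    memcpy(&len, buf, 4);
    memcpy(&code, buf + 4, 4);

    if (ntohl(len) != 8 || ntohl(code) != PG_SSLREQUEST_CODE)
        return -1;              /* not an SSLRequest */

    if (write(client_fd, "S", 1) != 1)
        return -1;

    return 0;
}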

- Heikki




Re: On login trigger: take three

2020-12-11 Thread Pavel Stehule
On Fri, Dec 11, 2020 at 16:07, Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

>
>
> On 09.12.2020 15:24, Pavel Stehule wrote:
>
>
>
> On Wed, Dec 9, 2020 at 13:17, Greg Nancarrow
> wrote:
>
>> On Tue, Dec 8, 2020 at 3:26 PM Pavel Stehule 
>> wrote:
>> >
>> >
>> > There are two maybe generic questions?
>> >
>> > 1. Maybe we can introduce more generic GUC for all event triggers like
>> disable_event_triggers? This GUC can be checked only by the database owner
>> or super user. It can be an alternative ALTER TABLE DISABLE TRIGGER ALL. It
>> can be protection against necessity to restart to single mode to repair the
>> event trigger. I think so more generic solution is better than special
>> disable_client_connection_trigger GUC.
>> >
>> > 2. I have no objection against client_connection. It is probably better
>> for the mentioned purpose - possibility to block connection to database.
>> Can be interesting, and I am not sure how much work it is to introduce the
>> second event - session_start. This event should be started after connecting
>> - so the exception there doesn't block connect, and should be started also
>> after the new statement "DISCARD SESSION", that will be started
>> automatically after DISCARD ALL.  This feature should not be implemented in
>> first step, but it can be a plan for support pooled connections
>> >
>>
>
> PGC_SU_BACKEND is too strong, there should be PGC_BACKEND if this option
> can be used by database owner
>
> Pavel
>
>
>> I've created a separate patch to address question (1), rather than
>> include it in the main patch, which I've adjusted accordingly. I'll
>> leave question (2) until another time, as you suggest.
>> See the attached patches.
>>
>> Regards,
>> Greg Nancarrow
>> Fujitsu Australia
>>
>
> It seems to me that current implementation of EventTriggersDisabled:
>
>
>  /*
>  * Indicates whether all event triggers are currently disabled
>  * for the current user.
>  * Event triggers are disabled when configuration parameter
>  * "disable_event_triggers" is true, and the current user
>  * is the database owner or has superuser privileges.
>  */
> static inline bool
> EventTriggersDisabled(void)
> {
> return (disable_event_triggers && pg_database_ownercheck(MyDatabaseId,
> GetUserId()));
> }
>
>
> is not correct. It makes it not possible to superuser to disable triggers
> for all users.
>

pg_database_ownercheck returns true for superuser always.

Also GUCs are not associated with any database. So I do not understand why
> this check of database ownership is relevant at all?
>
> What kind of protection violation we want to prevent?
>
> It seems to be obvious that normal user should not be able to prevent
> trigger execution because this triggers may be used to enforce some
> security policies.
> If trigger was created by user itself, then it can drop or disable it
> using ALTER statement. GUC is not needed for it.
>

when you cannot connect to the database, then you cannot do ALTER. In DBaaS
environments, a lot of users do not have superuser rights.



> So I think that EventTriggersDisabled function is not needed and can be
> replaced just with disable_event_triggers GUC check.
> And this option should be defined with PGC_SU_BACKEND
>
>
>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-11 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
>> On Wed, Dec 09, 2020 at 04:59:34PM -0500, Tom Lane wrote:
>> 0001 adds the ability to attach a subscript handler to an existing
>> data type with ALTER TYPE.  This is clearly going to be necessary
>> if we want extension types to be able to use this facility.  The
>> only thing that I think might be controversial here is that I did
>> not add the ability to set pg_type.typelem.

> I'm curious what could be the use case for setting pg_type.typelem for
> subscripting? I don't see this that much controversial, but maybe I'm
> missing something.

If you want the result of subscripting to be "text" or some other built-in
type, then clearly there's no need to use typelem for that, you can just
refer to the standard OID macros.  The potential use-case that I thought
of for setting typelem is where an extension defines types A and B and
would like subscripting of B to yield A.  Installing A's OID as B.typelem
would save a catalog lookup during subscript parsing, and remove a bunch
of edge failure cases such as what happens if A gets renamed.  However,
given the dependency behavior, this would also have the effect of "you
can't drop A without dropping B, and you can't modify A in any interesting
way either".  That would be annoyingly restrictive if there weren't any
actual physical containment relationship.  But on the other hand, maybe
it's acceptable and we just need to document it.

The other issue is what about existing stored SubscriptingRef structs.
If our backs were to the wall I'd think about removing the refelemtype
field so there's no stored image of typelem that needs to be updated.
But that would incur an extra catalog lookup in array_exec_setup, so
I don't much like it.  If we do add the ability to set typelem, I'd
prefer to just warn people to not change it once they've installed a
subscript handler.

Anyway, between those two issues I'm about -0.1 on adding a way to alter
typelem.  I won't fight hard if somebody wants it, but I'm inclined
to leave it out.

>> +1 using subscripts for hstore is nice idea

> Yeah, I also find it's a good suggestion, the implementation seems fine
> as well. As a side note, I'm surprised hstore doesn't have any
> functionality to update values, except hstore_concat.

Yeah.  I cribbed the subscript-fetch implementation from hstore_fetchval,
but was surprised to find that there wasn't any direct equivalent function
for subscript-store.  I guess people have gotten by with concat, but
it's not exactly an obvious way to do things.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-11 Thread Dmitry Dolgov
> On Wed, Dec 09, 2020 at 04:59:34PM -0500, Tom Lane wrote:
>
> 0001 adds the ability to attach a subscript handler to an existing
> data type with ALTER TYPE.  This is clearly going to be necessary
> if we want extension types to be able to use this facility.  The
> only thing that I think might be controversial here is that I did
> not add the ability to set pg_type.typelem.  While that'd be easy
> enough so far as ALTER TYPE is concerned, I'm not sure that we want
> to encourage people to change it.  The dependency rules mean that
> the semantics of typelem aren't something you really want to change
> after-the-fact on an existing type.  Also, if we did allow it, any
> existing SubscriptingRef.refelemtype values in stored views would
> fail to be updated.

I'm curious what could be the use case for setting pg_type.typelem for
subscripting? I don't see this that much controversial, but maybe I'm
missing something.

> On Thu, Dec 10, 2020 at 05:37:20AM +0100, Pavel Stehule wrote:
> On Wed, Dec 9, 2020 at 22:59, Tom Lane wrote:
>
> > 0002 makes use of that to support subscripting of hstore.  I'm not
> > sure how much we care about that from a functionality standpoint,
> > but it seems like it might be good to have a contrib module testing
> > that extensions can use this.  Also, I thought possibly an example
> > showing what's basically the minimum possible amount of complexity
> > would be good to have.  If people like this, I'll finish it up (it
> > lacks docs) and add it.
> >
>
> +1 using subscripts for hstore is nice idea

Yeah, I also find it's a good suggestion, the implementation seems fine
as well. As a side note, I'm surprised hstore doesn't have any
functionality to update values, except hstore_concat.




Re: On login trigger: take three

2020-12-11 Thread Konstantin Knizhnik



On 09.12.2020 15:24, Pavel Stehule wrote:



On Wed, Dec 9, 2020 at 13:17, Greg Nancarrow wrote:


On Tue, Dec 8, 2020 at 3:26 PM Pavel Stehule
<pavel.steh...@gmail.com> wrote:
>
>
> There are two maybe generic questions?
>
> 1. Maybe we can introduce more generic GUC for all event
triggers like disable_event_triggers? This GUC can be checked only
by the database owner or super user. It can be an alternative
ALTER TABLE DISABLE TRIGGER ALL. It can be protection against
necessity to restart to single mode to repair the event trigger. I
think so more generic solution is better than special
disable_client_connection_trigger GUC.
>
> 2. I have no objection against client_connection. It is probably
better for the mentioned purpose - possibility to block connection
to database. Can be interesting, and I am not sure how much work
it is to introduce the second event - session_start. This event
should be started after connecting - so the exception there
doesn't block connect, and should be started also after the new
statement "DISCARD SESSION", that will be started automatically
after DISCARD ALL.  This feature should not be implemented in
first step, but it can be a plan for support pooled connections
>


PGC_SU_BACKEND is too strong, there should be PGC_BACKEND if this 
option can be used by database owner


Pavel


I've created a separate patch to address question (1), rather than
include it in the main patch, which I've adjusted accordingly. I'll
leave question (2) until another time, as you suggest.
See the attached patches.

Regards,
Greg Nancarrow
Fujitsu Australia



It seems to me that current implementation of EventTriggersDisabled:


 /*
 * Indicates whether all event triggers are currently disabled
 * for the current user.
 * Event triggers are disabled when configuration parameter
 * "disable_event_triggers" is true, and the current user
 * is the database owner or has superuser privileges.
 */
static inline bool
EventTriggersDisabled(void)
{
    return (disable_event_triggers && 
pg_database_ownercheck(MyDatabaseId, GetUserId()));

}


is not correct. It makes it impossible for a superuser to disable
triggers for all users.
Also GUCs are not associated with any database. So I do not understand
why this check of database ownership is relevant at all?


What kind of protection violation we want to prevent?

It seems obvious that a normal user should not be able to prevent
trigger execution, because these triggers may be used to enforce some
security policies.
If the trigger was created by the user themselves, they can drop or disable it
using an ALTER statement. A GUC is not needed for that.


So I think that the EventTriggersDisabled function is not needed and can be
replaced with just a check of the disable_event_triggers GUC.

And this option should be defined with PGC_SU_BACKEND.
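As a rough sketch of what I mean (disable_event_triggers being the bool GUC
variable, declared as PGC_SU_BACKEND; no ownership check at all):

static inline bool
EventTriggersDisabled(void)
{
    return disable_event_triggers;
}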




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: On login trigger: take three

2020-12-11 Thread Konstantin Knizhnik



On 10.12.2020 21:09, Pavel Stehule wrote:



On Thu, Dec 10, 2020 at 16:48, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:




On 10.12.2020 18:12, Pavel Stehule wrote:


My idea was a little bit different. Inside postinit initialize
some global variables with info if there are event triggers or
not. And later you can use this variable to start transactions
and  other things.

There will be two access to pg_event_trigger, but it can
eliminate useless and probably more expensive start_transaction
and end_transaction.




Do you mean some variable in shared memory or GUCs?
It was my first idea - to use some flag in shared memory to make
a fast check possible that there are no event triggers.
But this flag would have to be set per database. Maybe I missed
something, but there is no per-database shared memory data
structure in Postgres.
Certainly it is possible to organize some hash db->event_info, but
it makes this patch several times more complex.


My idea was a little bit different - just inside process 
initialization, checking existence of event triggers, and later when 
it is necessary to start a transaction for trigger execution. This 
should reduce useless empty transactions,


I am sure this is a problem on computers with slower CPUs, although I 
have an i7, but because this feature is usually unused, the 
performance impact should be minimal every time.


Sorry, maybe I missed something, but now I am completely confused by 
your idea.

Right now processing of login hooks is done in PostgresMain.
In the previous mail you suggested doing it in InitPostgres, which 
is invoked from PostgresMain.

So what is the difference (except that there is an open transaction in InitPostgres)?

And what is the problem we are trying to solve now?
As far as I understand, your concern is about the extra overhead during 
backend startup needed to check whether there are on-login triggers defined.
We are not speaking about trigger execution. We mostly worry about 
applications which are not using triggers at all but still have to pay 
some small extra overhead at startup.
This overhead seems to be negligible (1% for a dummy connection doing just 
"select 1"). But taking into account that 99.% of applications will not 
use connection triggers, even a very small overhead seems to be not good.

But what can we do?
We do not want to scan the pg_event_trigger table on backend startup. But 
how else can we skip this check, taking into account that

1. Such a trigger can be registered at any moment in time
2. Triggers are registered per database, so we cannot have just a global 
flag signaling the lack of event triggers.


The only solution I see at this moment is a per-database hash 
in shared memory.

But it seems to be overkill from my point of view.

This is why I suggested having the disable_login_event_triggers GUC set to 
true by default.






From my point of view it is better to have separate GUC disabling
just client connection events and switch it on by default.
So only those who need this events with switch it on, other users
will not pay any extra cost for it.


It can work, but this design is not user friendly.  The significant 
bottleneck should be forking new processes, and checking the contents of 
some usually very small tables should be invisible. So if there is a 
possible way to implement some feature so that it can be enabled by default, 
then we should go this way. It would be pretty unfriendly if somebody 
writes a logon trigger and it does not work by default without any 
warning.

Please notice that event triggers are disabled by default.
You need to call "alter event trigger xxx enable always".
Maybe instead of GUCs we should somehow use the ALTER mechanism?
But I do not understand why it is better and how it will help to solve 
this problem (elimination of extra overhead when there are no triggers).




When I tested the last patch I found a problem (I have disabled assertions 
due to performance testing)


create role xx login;
alter system set disable_event_triggers to on; -- better should be 
positive form - enable_event_triggers to on, off

select pg_reload_conf();

psql -U xx

Thread 2.1 "postmaster" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f2bff438ec0 (LWP 827497)]
ResourceOwnerEnlargeCatCacheRefs (owner=0x0) at resowner.c:1025
1025 ResourceArrayEnlarge(&(owner->catrefarr));
(gdb) bt
#0  ResourceOwnerEnlargeCatCacheRefs (owner=0x0) at resowner.c:1025
#1  0x008a70f8 in SearchCatCacheInternal (cache=<optimized out>, nkeys=nkeys@entry=1, v1=v1@entry=13836, v2=v2@entry=0,
    v3=v3@entry=0, v4=v4@entry=0) at catcache.c:1273
#2  0x008a7575 in SearchCatCache1 (cache=<optimized out>, 
v1=v1@entry=13836) at catcache.c:1167
#3  0x008b7f80 in SearchSysCache1 (cacheId=cacheId@entry=21, 
key1=key1@entry=13836) at syscache.c:1122
#4  0x005939cd in pg_database_ownercheck (db_oid=13836, 
roleid=16387) at aclchk.c:5114

#5  0x00605b42 in 

Re: New Table Access Methods for Multi and Single Inserts

2020-12-11 Thread Bharath Rupireddy
On Tue, Dec 8, 2020 at 6:27 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:
> Hi,
>
> Currently, for any component (such as COPY, CTAS[1], CREATE/REFRESH
> Mat View[1], INSERT INTO SELECTs[2]) multi insert logic such as buffer
> slots allocation, maintenance, decision to flush and clean up, need to
> be implemented outside the table_multi_insert() API. The main problem
> is that it fails to take into consideration the underlying storage
> engine capabilities, for more details of this point refer to a
> discussion in multi inserts in CTAS thread[1]. This also creates a lot
> of duplicate code which is more error prone and not maintainable.
>
> More importantly, in another thread [3] @Andres Freund suggested to
> have table insert APIs in such a way that they look more like 'scan'
> APIs i.e. insert_begin, insert, insert_end. The main advantages doing
> this are(quoting from his statement in [3]) - "more importantly it'd
> allow an AM to optimize operations across multiple inserts, which is
> important for column stores."
>
> I propose to introduce new table access methods for both multi and
> single inserts based on the prototype suggested by Andres in [3]. Main
> design goal of these new APIs is to give flexibility to tableam
> developers in implementing multi insert logic dependent on the
> underlying storage engine.
>
> Below are the APIs. I suggest to have a look at
> v1-0001-New-Table-AMs-for-Multi-and-Single-Inserts.patch for details
> of the new data structure and the API functionality. Note that
> temporarily I used XX_v2, we can change it later.
>
> TableInsertState* table_insert_begin(initial_args);
> void table_insert_v2(TableInsertState *state, TupleTableSlot *slot);
> void table_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot);
> void table_multi_insert_flush(TableInsertState *state);
> void table_insert_end(TableInsertState *state);
>
> I'm attaching a few patches(just to show that these APIs work, avoids
> a lot of duplicate code and makes life easier). Better commenting can
> be added later. If these APIs and patches look okay, we can even
> consider replacing them in other places such as nodeModifyTable.c and
> so on.
>
> v1-0001-New-Table-AMs-for-Multi-and-Single-Inserts.patch --->
> introduces new table access methods for multi and single inserts. Also
> implements/rearranges the outside code for heap am into these new
> APIs.
> v1-0002-CTAS-and-REFRESH-Mat-View-With-New-Multi-Insert-Table-AM.patch
> ---> adds new multi insert table access methods to CREATE TABLE AS,
> CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW.
> v1-0003-ATRewriteTable-With-New-Single-Insert-Table-AM.patch ---> adds
> new single insert table access method to ALTER TABLE rewrite table
> code.
> v1-0004-COPY-With-New-Multi-and-Single-Insert-Table-AM.patch ---> adds
> new single and multi insert table access method to COPY code.
>
> Thoughts?
>
> [1] -
https://www.postgresql.org/message-id/4eee0730-f6ec-e72d-3477-561643f4b327%40swarm64.com
> [2] -
https://www.postgresql.org/message-id/20201124020020.GK24052%40telsasoft.com
> [3] -
https://www.postgresql.org/message-id/20200924024128.kyk3r5g7dnu3fxxx%40alap3.anarazel.de

Added this to commitfest to get it reviewed further.

https://commitfest.postgresql.org/31/2871/
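For quick reference while reviewing, here is a rough sketch (my illustration,
not code from the patches) of how a caller such as COPY might drive the
proposed API; next_source_slot() and the begin() arguments are placeholders,
since "initial_args" is still open:

    TableInsertState *istate;
    TupleTableSlot *slot;

    istate = table_insert_begin(rel /* ... other initial_args ... */ );

    while ((slot = next_source_slot()) != NULL)  /* placeholder tuple source */
    {
        /* the AM decides internally when to buffer and when to flush */
        table_multi_insert_v2(istate, slot);
    }

    /* flush anything the AM still has buffered, then tear down */
    table_multi_insert_flush(istate);
    table_insert_end(istate);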

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


Re: Feature Proposal: Add ssltermination parameter for SNI-based LoadBalancing

2020-12-11 Thread Heikki Linnakangas

On 10/12/2020 17:49, Lukas Meisegeier wrote:

I try to host multiple postgresql-servers on the same ip and the same
port through SNI-based load-balancing.
Currently this is not possible because of two issues:
1. The psql client won't set the tls-sni-extension correctly
(https://www.postgresql.org/message-id/20181211145240.GL20222%40redhat.com)
2. The psql connection protocol implements a SSLRequest in plain text
before actually opening a connection.

The first issue is easily solvable by calling
`SSL_set_tlsext_host_name(conn->ssl,
conn->connhost[conn->whichhost].host)` before opening the connection.

The second issue is also solvable through a new parameter
"ssltermination" which if set to "proxy" will skip the initial
SSLRequest and connects directly through ssl.
The default value would be "server" which changes nothing on the
existing behaviour.


Don't you need backend changes as well? The backend will still expect 
the client to send an SSLRequest. Or is the connection from the proxy to 
the actual server unencrypted?


It's not very nice that the client needs to set special options, 
depending on whether the server is behind a proxy or not. Could you 
teach the proxy to deal with the SSLRequest message?


Perhaps we should teach the backend to accept a TLS ClientHello 
directly, without the SSLRequest message. That way, the client could 
send the ClientHello without SSLRequest, and the proxy wouldn't need to 
care about SSLRequest. It would also eliminate one round-trip from the 
protocol handshake, which would be nice. A long deprecation/transition 
period would be needed before we could make that the default behavior, 
but that's ok.


- Heikki




Re: Insert Documentation - Returning Clause and Order

2020-12-11 Thread Ashutosh Bapat
On Thu, Dec 10, 2020 at 7:49 PM David G. Johnston
 wrote:
>
> On Thursday, December 10, 2020, Ashutosh Bapat  
> wrote:
>>
>> On Wed, Dec 9, 2020 at 9:10 PM David G. Johnston
>>  wrote:
>> >
>> > Hey,
>> >
>> > Would it be accurate to add the following sentence to the INSERT 
>> > documentation under "Outputs"?
>> >
>> > "...inserted or updated by the command."  For a multiple-values insertion, 
>> > the order of output rows will match the order that rows are presented in 
>> > the values or query clause.
>>
>> Postgres's current implementation may be doing so, but I don't think
>> that can be guaranteed in possible implementations. I don't think
>> restricting choice of implementation to guarantee that is a good idea
>> either.
>>
>
> Yeah, the ongoing work on parallel inserts would seem to be an issue.  We 
> should probably document that though.  And maybe as part of parallel inserts 
> patch provide a user-specifiable way to ask for such a guarantee if needed.  
> ‘Insert returning ordered”

I am curious about the use case which needs that guarantee. Don't you
have a column on which you can ORDER BY so that it returns the same
order as the INSERT?

-- 
Best Wishes,
Ashutosh Bapat




Re: PoC/WIP: Extended statistics on expressions

2020-12-11 Thread Dean Rasheed
On Tue, 8 Dec 2020 at 12:44, Tomas Vondra  wrote:
>
> Possibly. But I don't think it's worth the extra complexity. I don't
> expect people to have a lot of overlapping stats, so the amount of
> wasted space and CPU time is expected to be fairly limited.
>
> So I don't think it's worth spending too much time on this now. Let's
> just do what you proposed, and revisit this later if needed.
>

Yes, I think that's a reasonable approach to take. As long as the
documentation makes it clear that building MCV stats also causes
standard expression stats to be built on any expressions included in
the list, then the user will know and can avoid duplication most of
the time. I don't think there's any need for code to try to prevent
that -- just as we don't bother with code to prevent a user building
multiple indexes on the same column.

The only case where duplication won't be avoidable is where there are
multiple MCV stats sharing the same expression, but that's probably
quite unlikely in practice, and it seems acceptable to leave improving
that as a possible future optimisation.

Regards,
Dean




Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2020-12-11 Thread Bharath Rupireddy
On Thu, Dec 10, 2020 at 7:14 AM Bharath Rupireddy
 wrote:
> > +   /* We only look for active and open remote connections. */
> > +   if (entry->invalidated || !entry->conn)
> > +   continue;
> >
> > We should return even invalidated entry because it has still cached 
> > connection?
> >
>
> I checked this point earlier, for invalidated connections, the tuple
> returned from the cache is also invalid and the following error will
> be thrown. So, we can not get the server name for that user mapping.
> Cache entries too would have been invalidated after the connection is
> marked as invalid in pgfdw_inval_callback().
>
> umaptup = SearchSysCache1(USERMAPPINGOID, ObjectIdGetDatum(entry->key));
> if (!HeapTupleIsValid(umaptup))
>   elog(ERROR, "cache lookup failed for user mapping with OID %u",
> entry->key);
>

I further checked on returning invalidated connections in the output
of the function. Actually, the reason I'm seeing a null tuple from sys
cache (and hence the error "cache lookup failed for user mapping with
OID ") for an invalidated connection is that the user mapping
(with OID entry->key that exists in the cache) is getting dropped, so
the sys cache returns null tuple. The use case is as follows:

1) Create a server, role, and user mapping of the role with the server
2) Run a foreign table query, so that the connection related to the
server gets cached
3) Issue DROP OWNED BY for the role created, since the user mapping is
dependent on that role, it gets dropped from the catalogue table and
an invalidation message will be pending to clear the sys cache
associated with that user mapping.
4) Now, if I do select * from postgres_fdw_get_connections() or for
that matter any query, at the beginning the txn
AtStart_Cache()-->AcceptInvalidationMessages()-->pgfdw_inval_callback()
gets called and marks the cached entry as invalidated. Remember the
reason for this invalidation message is that the user mapping with the
OID entry->key is dropped from 3). Later in
postgres_fdw_get_connections(), when we search the sys cache with
entry->key for that invalidated connection, since the user mapping is
dropped from the system, null tuple is returned.

If we were to show invalidated connections in the output of
postgres_fdw_get_connections(), we can ignore the entry and continue
further if the user mapping sys cache search returns null tuple:

umaptup = SearchSysCache1(USERMAPPINGOID, ObjectIdGetDatum(entry->key));

if (!HeapTupleIsValid(umaptup))
   continue;

Thoughts?

> > Also this makes me wonder if we should return both the server name and 
> > boolean flag indicating whether it's invalidated or not. If so, users can 
> > easily find the invalidated connection entry and disconnect it because 
> > there is no need to keep invalidated connection.
> >
>
> Currently we are returning a list of foreign server names for which
> there exist active connections. If we somehow address the
> above-mentioned problem for invalid connections and choose to show them as
> well, then what should our output look like? Is it something like a
> list of pairs (servername, validflag)?

If we agree on the above point, we can output something like: (myserver1,
valid), (myserver2, valid), (myserver3, invalid), (myserver4, valid)
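If we go that way, the per-entry output could be built roughly like this
(untested sketch, assuming the function becomes a set-returning function
whose tuplestore/tupdesc setup is omitted, and reusing the hypothetical
server-name helper discussed upthread):

hash_seq_init(&scan, ConnectionHash);
while ((entry = (ConnCacheEntry *) hash_seq_search(&scan)) != NULL)
{
    Datum       values[2];
    bool        nulls[2] = {false, false};

    /* skip entries with no open libpq connection */
    if (entry->conn == NULL)
        continue;

    /* server name resolution for invalidated entries is the open question */
    values[0] = CStringGetTextDatum(pgfdw_get_server_name(entry));
    values[1] = BoolGetDatum(!entry->invalidated);

    tuplestore_putvalues(tupstore, tupdesc, values, nulls);
}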


Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Fail Fast In CTAS/CMV If Relation Already Exists To Avoid Unnecessary Rewrite, Planning Costs

2020-12-11 Thread Bharath Rupireddy
On Fri, Dec 11, 2020 at 1:40 PM Michael Paquier  wrote:
> On Fri, Dec 11, 2020 at 12:48:49PM +0530, Bharath Rupireddy wrote:
> > I'm not quite sure how other databases behave. If I go by the main
> > intention of EXPLAIN without ANALYZE, that should do the planning,
> > show it in the output and no execution of the query should happen. For
> > EXPLAIN CTAS/CMV, only thing that gets planned is the SELECT part and
> > no execution happens so no existence check for the CTAS/CMV relation
> > that will get created if the CTAS/CMV is executed. Having said that,
> > the existence of the relations that are in the SELECT part are anyways
> > checked during planning for EXPLAIN without ANALYZE.
>
> I think that it is tricky to define IF NOT EXISTS for a CTAS with
> EXPLAIN. How would you for example treat an EXPLAIN ANALYZE with a
> query that includes an INSERT RETURNING in a WITH clause.  Would you
> say that we do nothing if the relation exists?  Or would you execute
> it, still insert nothing on the result relation because it already
> exists, even if the inner query may have inserted something as part of
> its execution on a different relation?

I may not have got your above scenario correctly (it would be good if
you could provide the use case, in case I want to check something there).
I tried the following; all the involved relations are being
checked for existence even for EXPLAIN:
postgres=#   EXPLAIN WITH temp1 AS (SELECT * FROM t1) INSERT INTO
t1_does_not_exit VALUES (1);
ERROR:  relation "t1_does_not_exit" does not exist
LINE 1: ...LAIN WITH temp1 AS (SELECT * FROM t1) INSERT INTO t1_does_no...
 ^
IIUC, is it that we want the following behaviour in case the relation
that CTAS/CMV is trying to create already exists? Note that the sample
queries are run on the latest master branch:

EXPLAIN: throw an error, instead of showing the select plan as the query
does on the master branch currently?
postgres=# explain create table t2 as select * from t1;
 QUERY PLAN

 Seq Scan on t1  (cost=0.00..2.00 rows=100 width=8)

EXPLAIN ANALYZE: throw an error as it does on master branch?
postgres=# explain analyze create table t2 as select * from t1;
ERROR:  relation "t2" already exists

EXPLAIN with if-not-exists clause: throw a warning and an empty plan
from ExplainOneUtility? If not an empty plan, we should be doing the
relation existence check before we come to explain routines, maybe in
gram.c? On the master branch it doesn't happen, the query shows the
plan for select part as shown below.
postgres=# explain create table if not exists t2 as select * from t1;
 QUERY PLAN

 Seq Scan on t1  (cost=0.00..2.00 rows=100 width=8)

EXPLAIN ANALYZE with if-not-exists clause: (ideally, for if-not-exists
clause we expect a warning to be issued, but currently relation
existence error is thrown) a warning and an empty plan from
ExplainOneUtility? If not an empty plan, we should be doing the
relation existence check before we come to explain routines, maybe in
gram.c? On the master branch an ERROR is thrown.
postgres=# explain analyze create table if not exists t2 as select * from t1;
ERROR:  relation "t2" already exists

For plain CTAS -> throw an error as it happens on master branch.
postgres=# create table t2 as select * from t1;
ERROR:  relation "t2" already exists

For plain CTAS with if-not-exists clause -> a warning is issued as it
happens on master branch.
postgres=# create table if not exists t2 as select * from t1;
NOTICE:  relation "t2" already exists, skipping
CREATE TABLE AS
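For context, an early check of the kind being discussed could look roughly
like the following (my sketch only; the function name and message wording
are illustrative, not from the patch):

static bool
ctas_relation_exists(CreateTableAsStmt *ctas)
{
    IntoClause *into = ctas->into;
    Oid         nspid;

    /* resolve the creation namespace before any rewrite/planning work */
    nspid = RangeVarGetCreationNamespace(into->rel);

    if (get_relname_relid(into->rel->relname, nspid) != InvalidOid)
    {
        if (!ctas->if_not_exists)
            ereport(ERROR,
                    (errcode(ERRCODE_DUPLICATE_TABLE),
                     errmsg("relation \"%s\" already exists",
                            into->rel->relname)));

        ereport(NOTICE,
                (errcode(ERRCODE_DUPLICATE_TABLE),
                 errmsg("relation \"%s\" already exists, skipping",
                        into->rel->relname)));
        return true;
    }

    return false;
}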

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: pg_shmem_allocations & documentation

2020-12-11 Thread Benoit Lobréau
Would "NULL for anonymous allocations, since details related to them are
not known." be ok ?


On Fri, Dec 11, 2020 at 09:29, Kyotaro Horiguchi
wrote:

> At Fri, 11 Dec 2020 14:42:45 +0900, Michael Paquier 
> wrote in
> > On Fri, Dec 11, 2020 at 11:00:58AM +0900, Kyotaro Horiguchi wrote:
> > > Although we could just rip some words off, I'd like to propose instead
> > > to add an explanation why it is not exposed for anonymous allocations,
> > > like the column allocated_size.
> >
> > Indeed, there is a hiccup between what the code does and what the docs
> > tell: the offset is not NULL for unused memory.
> >
> > > -   The offset at which the allocation starts. NULL for anonymous
> > > -   allocations and unused memory.
> > > +   The offset at which the allocation starts. For anonymous
> allocations,
> > > +   no information about individual allocations is available, so
> the column
> > > +   will be NULL in that case.
> >
> > I'd say: let's be simple and just remove "and unused memory" because
> > anonymous allocations are...  Anonymous so you cannot know details
> > related to them.  That's something easy to reason about, and the docs
> > were written originally to remain simple.
>
> Hmm. I don't object to that.  However, isn't the description for
> allocated_size too verbose in that sense?
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>


Re: pg_waldump error message fix

2020-12-11 Thread Kyotaro Horiguchi
At Fri, 11 Dec 2020 17:19:33 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> At Fri, 11 Dec 2020 14:21:57 +0900, Michael Paquier  
> wrote in 
> > On Fri, Dec 11, 2020 at 01:30:16PM +0900, Kyotaro Horiguchi wrote:
> > > currRecPtr is a private member of the struct (see the definition of
> > > the struct XLogReaderState).  We should instead use EndRecPtr outside
> > > xlog reader.
> > 
> > Please note that this is documented in xlogreader.h.  Hmm.  I can see
> > your point here, still I think that what both of you are suggesting
> > is not completely correct.  For example, switching to EndRecPtr would
> 
> EndRecPtr is defined as pointing to the LSN at which to start reading the next
> record.  It doesn't move if XLogReadRecord failed to read the
> record. I think this is documented in a comment somewhere.  It can
> point to the beginning of a page, so "error in WAL record at <page start>" is a kind of bogus, but that is not the point here.
> 
> > cause DecodeXLogRecord() to report an error with the end of the
> > current record but it makes more sense to point to ReadRecPtr in this
> 
> DecodeXLogRecord() handles a record already successfully read. So
> ReadRecPtr is pointing to the beginning of the given record at that
> time.  pg_waldump:main() and ReadRecord (or the context of
> DecodeXLogRecord()) are in different contexts.  The place in question
> in pg_waldump seems to be a result of a thinko that it can use
> ReadRecPtr regardless of whether XLogReadRecord successfully read a
> record or not.
> 
> > case.  On the other hand, it would make sense to report the beginning 
> > of the position we are working on when validating the header if an
> > error happens at this point.  So it seems to me that EndRecPtr or
> > ReadRecPtr are not completely correct, and that we may need an extra
> > LSN position to tell on which LSN we are working on instead that gets
> > updated before the validation part, because we update ReadRecPtr and
> > EndRecPtr only after this initial validation is done.
> 
> So we cannot use the ErrorRecPtr since pg_waldump:main() should show
> the LSN where XLogReadRecord() found an invalid record and DecodeXLogRecord()
> should show the LSN where XLogReadRecord() found a valid record.

Wait! That's wrong.  Yeah, we can add an ErrorRecPtr to point to the error
record regardless of the context.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: pg_shmem_allocations & documentation

2020-12-11 Thread Kyotaro Horiguchi
At Fri, 11 Dec 2020 14:42:45 +0900, Michael Paquier  wrote 
in 
> On Fri, Dec 11, 2020 at 11:00:58AM +0900, Kyotaro Horiguchi wrote:
> > Although we could just rip some words off, I'd like to propose instead
> > to add an explanation why it is not exposed for anonymous allocations,
> > like the column allocated_size.
> 
> Indeed, there is a hiccup between what the code does and what the docs
> tell: the offset is not NULL for unused memory.
> 
> > -   The offset at which the allocation starts. NULL for anonymous
> > -   allocations and unused memory.
> > +   The offset at which the allocation starts. For anonymous 
> > allocations,
> > +   no information about individual allocations is available, so the 
> > column
> > +   will be NULL in that case.
> 
> I'd say: let's be simple and just remove "and unused memory" because
> anonymous allocations are...  Anonymous so you cannot know details
> related to them.  That's something easy to reason about, and the docs
> were written originally to remain simple.

Hmm. I don't object to that.  However, isn't the description for
allocated_size too verbose in that sense?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: pg_waldump error message fix

2020-12-11 Thread Kyotaro Horiguchi
At Fri, 11 Dec 2020 14:21:57 +0900, Michael Paquier  wrote 
in 
> On Fri, Dec 11, 2020 at 01:30:16PM +0900, Kyotaro Horiguchi wrote:
> > currRecPtr is a private member of the struct (see the definition of
> > the struct XLogReaderState).  We should instead use EndRecPtr outside
> > xlog reader.
> 
> Please note that this is documented in xlogreader.h.  Hmm.  I can see
> your point here, still I think that what both of you are suggesting
> is not completely correct.  For example, switching to EndRecPtr would

EndRecPtr is defined as pointing to the LSN at which to start reading the next
record.  It doesn't move if XLogReadRecord failed to read the
record. I think this is documented in a comment somewhere.  It can
point to the beginning of a page, so "error in WAL record at <page start>" is a kind of bogus, but that is not the point here.

> cause DecodeXLogRecord() to report an error with the end of the
> current record but it makes more sense to point to ReadRecPtr in this

DecodeXLogRecord() handles a record already successfully read. So
ReadRecPtr is pointing to the beginning of the given record at that
time.  pg_waldump:main() and ReadRecord (or the context of
DecodeXLogRecord()) are in different contexts.  The place in question
in pg_waldump seems to be a result of a thinko that it can use
ReadRecPtr regardless of whether XLogReadRecord successfully read a
record or not.

> case.  On the other hand, it would make sense to report the beginning 
> of the position we are working on when validating the header if an
> error happens at this point.  So it seems to me that EndRecPtr or
> ReadRecPtr are not completely correct, and that we may need an extra
> LSN position to tell on which LSN we are working on instead that gets
> updated before the validation part, because we update ReadRecPtr and
> EndRecPtr only after this initial validation is done.

So we cannot use the ErrorRecPtr since pg_waldump:main() should show
the LSN where XLogReadRecord() found an invalid record and DecodeXLogRecord()
should show the LSN where XLogReadRecord() found a valid record.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Fail Fast In CTAS/CMV If Relation Already Exists To Avoid Unnecessary Rewrite, Planning Costs

2020-12-11 Thread Michael Paquier
On Fri, Dec 11, 2020 at 12:48:49PM +0530, Bharath Rupireddy wrote:
> I'm not quite sure how other databases behave. If I go by the main
> intention of EXPLAIN without ANALYZE, that should do the planning,
> show it in the output and no execution of the query should happen. For
> EXPLAIN CTAS/CMV, only thing that gets planned is the SELECT part and
> no execution happens so no existence check for the CTAS/CMV relation
> that will get created if the CTAS/CMV is executed. Having said that,
> the existence of the relations that are in the SELECT part are anyways
> checked during planning for EXPLAIN without ANALYZE.

I think that it is tricky to define IF NOT EXISTS for a CTAS with
EXPLAIN.  How would you for example treat an EXPLAIN ANALYZE with a
query that includes an INSERT RETURNING in a WITH clause.  Would you
say that we do nothing if the relation exists?  Or would you execute
it, still insert nothing on the result relation because it already
exists, even if the inner query may have inserted something as part of
its execution on a different relation?
--
Michael


signature.asc
Description: PGP signature


Re: pg_basebackup test coverage

2020-12-11 Thread Michael Paquier
On Thu, Dec 10, 2020 at 10:53:51PM -0800, Noah Misch wrote:
> On Thu, Dec 10, 2020 at 12:32:52PM -0500, Robert Haas wrote:
>> Now, there's nothing to prevent us from running commands like this
>> from the pg_basebackup tests, but it doesn't seem like we could really
>> check anything meaningful. Perhaps we'd notice if the command exited
>> non-zero or didn't produce any output, but it would be nice to verify
>> that the resulting backups are actually correct. The way to do that
>> would seem to be to extract the tar file (and decompress it first, in
>> the -z/-Z case) and then run pg_verifybackup on the result and check
>> that it passes. However, as far as I can tell there's no guarantee
>> that the user has 'tar' or 'gunzip' installed on their system, so I
>> don't see a clean way to do this. A short (but perhaps incomplete)
>> search didn't turn up similar precedents in the existing tests.
>> 
>> Any ideas?
> 
> I would probe for the commands with "tar -cf- anyfile | tar -xf-" and "echo
> foo | gzip | gunzip".  If those fail, skip any test that relies on an
> unavailable command.

Why don't you just use Archive::Tar [1] instead of looking for some system
commands?  Combining list_files() with extract(), it is possible to
extract an archive, with or without compression, without hoping for an
equivalent to exist on Windows.  That would not be the first time
either that there is a TAP test that skips some tests if a module does
not exist.  See for example what psql does with IO::Pty.

[1]: https://metacpan.org/pod/Archive::Tar
--
Michael


signature.asc
Description: PGP signature