Re: Support prepared statement invalidation when result types change

2024-04-03 Thread Jelte Fennema-Nio
On Sun, 7 Jan 2024 at 07:55, vignesh C  wrote:
> One of the test has aborted in CFBot at [1] with:

Rebased the patchset and removed patch 0003 since it was causing the
CI issue reported by vignesh and it seems much less useful and more
controversial to me anyway (due to the extra required roundtrip).


On Sun, 7 Jan 2024 at 09:17, Shay Rojansky  wrote:
> Just to point out, FWIW, that the .NET Npgsql driver does indeed cache 
> RowDescriptions... The whole point of preparation is to optimize things as 
> much as possible for repeated execution of the query; I get that the value 
> there is much lower than e.g. doing another network roundtrip, but that's 
> still extra work that's better off being cut if it can be.

Hmm, interesting. I totally agree that it's always good to do less
when possible. The problem is that in the face of server side prepared
statement invalidations due to DDL changes to the table or search path
changes, the row types might change. Or the server needs to constantly
throw an error, like it does now, but that seems worse imho.

I'm wondering though if we can create a middleground, where a client
can still cache the RowDescription client side when no DDL or
search_patch changes are happening. But still tell the client about a
new RowDescription when they do happen.

The only way of doing that I can think of is changing the Postgres
protocol in a way similar to this: Allow Execute to return a
RowDescription too, but have the server only do so when the previously
received RowDescription for this prepared statement is now invalid.

This would definitely require some additional tracking at PgBouncer to
make it work, i.e. per client and per server it should now keep track
of the last RowDescription for each prepared statement. But that's
definitely something we could do.

This would make this patch much more involved though, as now it would
suddenly involve an actual protocol change, and that basically depends
on this patch moving forward[1].

[1]: 
https://www.postgresql.org/message-id/flat/cageczqtg2hcmb5gau53uuwcdc7gcnjfll6mnw0wnhwhgq9u...@mail.gmail.com


v6-0002-Completely-remove-fixed_result-from-CachedPlanSou.patch
Description: Binary data


v6-0001-Support-prepared-statement-invalidation-when-resu.patch
Description: Binary data


Re: Support prepared statement invalidation when result types change

2024-01-07 Thread Shay Rojansky
On Mon, Sep 18, 2023 at 1:31 PM Jelte Fennema-Nio  wrote:

> Furthermore caching RowDescription is also not super useful, most
> clients request it every time because it does not require an extra
> round trip, so there's almost no overhead in requesting it.

Just to point out, FWIW, that the .NET Npgsql driver does indeed cache
RowDescriptions... The whole point of preparation is to optimize things as
much as possible for repeated execution of the query; I get that the value
there is much lower than e.g. doing another network roundtrip, but that's
still extra work that's better off being cut if it can be.


Re: Support prepared statement invalidation when result types change

2024-01-06 Thread vignesh C
On Mon, 18 Sept 2023 at 18:01, Jelte Fennema-Nio  wrote:
>
> @Euler thanks for the review. I addressed the feedback.
>
> On Fri, 15 Sept 2023 at 01:41, Andy Fan  wrote:
> > What if a client has *cached* an old version of RowDescription
> > and the server changed it to something new and sent resultdata
> > with the new RowDescription.  Will the client still be able to work
> > expectly?
>
> It depends a bit on the exact change. For instance a column being
> added to the end of the resultdata shouldn't be a problem. And that is
> actually quite a common case for this issue:
> 1. PREPARE p as (SELECT * FROM t);
> 2. ALTER TABLE t ADD COLUMN ...
> 3. EXECUTE p
>
> But type changes of existing columns might cause issues when the
> RowDescription is cached. But such changes also cause issues now.
> Currently the prepared statement becomes unusable when this happens
> (returning errors every time). With this patch it's at least possible
> to have prepared statements continue working in many cases.
> Furthermore caching RowDescription is also not super useful, most
> clients request it every time because it does not require an extra
> round trip, so there's almost no overhead in requesting it.
>
> Clients caching ParameterDescription seems more useful because
> fetching the parameter types does require an extra round trip. So
> caching it could cause errors with 0003. But right now if the argument
> types need to change it gives an error every time when executing the
> prepared statement. So I believe 0003 is still an improvement over the
> status quo, because there are many cases where the client knows that
> the types might have changed and it thus needs to re-fetch the
> ParameterDescription: the most common case is changing the
> search_path. And there's also cases where even a cached
> ParamaterDescription will work fine: e.g. the type is changed but the
> encoding stays the same (e.g. drop + create an enum, or text/varchar,
> or the text encoding of int and bigint)

One of the test has aborted in CFBot at [1] with:
[05:26:16.214] Core was generated by `postgres: postgres
regression_pg_stat_statements [local] EXECUTE  '.
[05:26:16.214] Program terminated with signal SIGABRT, Aborted.
[05:26:16.214] #0  __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:50
[05:26:16.214] Download failed: Invalid argument.  Continuing without
source file ./signal/../sysdeps/unix/sysv/linux/raise.c.
[05:26:16.392]
[05:26:16.392] Thread 1 (Thread 0x7fbe1d997a40 (LWP 28738)):
[05:26:16.392] #0  __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:50


[05:26:36.911] #5  0x55c5aa523e71 in RevalidateCachedQuery
(plansource=0x55c5ac811cf0, queryEnv=queryEnv@entry=0x0) at
../src/backend/utils/cache/plancache.c:730
[05:26:36.911] num_params = 0
[05:26:36.911] param_types = 0x55c5ac860438
[05:26:36.911] snapshot_set = false
[05:26:36.911] rawtree = 0x55c5ac859f08
[05:26:36.911] tlist = 
[05:26:36.911] qlist = 
[05:26:36.911] resultDesc = 
[05:26:36.911] querytree_context = 
[05:26:36.911] oldcxt = 
[05:26:36.911] #6  0x55c5a9de0262 in ExplainExecuteQuery
(execstmt=0x55c5ac6d9d38, into=0x0, es=0x55c5ac859648,
queryString=0x55c5ac6d91e0 "EXPLAIN (VERBOSE, COSTS OFF) EXECUTE
st6;", params=0x0, queryEnv=0x0) at
../src/backend/commands/prepare.c:594
[05:26:36.911] entry = 0x55c5ac839ba8
[05:26:36.911] query_string = 
[05:26:36.911] cplan = 
[05:26:36.911] plan_list = 
[05:26:36.911] p = 
[05:26:36.911] paramLI = 0x0
[05:26:36.911] estate = 0x0
[05:26:36.911] planstart = {ticks = }
[05:26:36.911] planduration = {ticks = 1103806595203}
[05:26:36.911] bufusage_start = {shared_blks_hit = 1,
shared_blks_read = 16443, shared_blks_dirtied = 16443,
shared_blks_written = 8, local_blks_hit = 94307489783240,
local_blks_read = 94307451894117, local_blks_dirtied = 94307489783240,
local_blks_written = 140727004487184, temp_blks_read = 0,
temp_blks_written = 94307489783416, shared_blk_read_time = {ticks =
0}, shared_blk_write_time = {ticks = 94307489780192},
local_blk_read_time = {ticks = 0}, local_blk_write_time = {ticks =
94307491025040}, temp_blk_read_time = {ticks = 0}, temp_blk_write_time
= {ticks = 0}}
[05:26:36.911] bufusage = {shared_blks_hit = 140727004486192,
shared_blks_read = 140061866319832, shared_blks_dirtied = 8,
shared_blks_written = 94307447196988, local_blks_hit = 34359738376,
local_blks_read = 94307489783240, local_blks_dirtied = 70622147264512,
local_blks_written = 94307491357144, temp_blks_read = 140061866319832,
temp_blks_written = 94307489783240, shared_blk_read_time = {ticks =
140727004486192}, shared_blk_write_time = {ticks = 140061866319832},
local_blk_read_time = {ticks = 8}, local_blk_write_time = {ticks =
94307489783240}, temp_blk_read_time = {ticks = 0}, temp_blk_write_time
= {ticks = 94307447197163}}

Re: Support prepared statement invalidation when result types change

2023-09-18 Thread Jelte Fennema-Nio
@Euler thanks for the review. I addressed the feedback.

On Fri, 15 Sept 2023 at 01:41, Andy Fan  wrote:
> What if a client has *cached* an old version of RowDescription
> and the server changed it to something new and sent resultdata
> with the new RowDescription.  Will the client still be able to work
> expectly?

It depends a bit on the exact change. For instance a column being
added to the end of the resultdata shouldn't be a problem. And that is
actually quite a common case for this issue:
1. PREPARE p as (SELECT * FROM t);
2. ALTER TABLE t ADD COLUMN ...
3. EXECUTE p

But type changes of existing columns might cause issues when the
RowDescription is cached. But such changes also cause issues now.
Currently the prepared statement becomes unusable when this happens
(returning errors every time). With this patch it's at least possible
to have prepared statements continue working in many cases.
Furthermore caching RowDescription is also not super useful, most
clients request it every time because it does not require an extra
round trip, so there's almost no overhead in requesting it.

Clients caching ParameterDescription seems more useful because
fetching the parameter types does require an extra round trip. So
caching it could cause errors with 0003. But right now if the argument
types need to change it gives an error every time when executing the
prepared statement. So I believe 0003 is still an improvement over the
status quo, because there are many cases where the client knows that
the types might have changed and it thus needs to re-fetch the
ParameterDescription: the most common case is changing the
search_path. And there's also cases where even a cached
ParamaterDescription will work fine: e.g. the type is changed but the
encoding stays the same (e.g. drop + create an enum, or text/varchar,
or the text encoding of int and bigint)


v5-0003-Support-changing-argument-types-of-prepared-state.patch
Description: Binary data


v5-0002-Completely-remove-fixed_result-from-CachedPlanSou.patch
Description: Binary data


v5-0001-Support-prepared-statement-invalidation-when-resu.patch
Description: Binary data


Re: Support prepared statement invalidation when result types change

2023-09-14 Thread Andy Fan
Hi,


>
> This requirement was not documented anywhere and it
> can thus be a surprising error to hit. But it's actually not needed for
> this to be an error, as long as we send the correct RowDescription there
> does not have to be a problem for clients when the result types or
> column counts change.
>

What if a client has *cached* an old version of RowDescription
and the server changed it to something new and sent resultdata
with the new RowDescription.  Will the client still be able to work
expectly?

I don't hope my concern is right since I didn't go through any of
the drivers in detail, but I hope my concern is expressed correctly.


-- 
Best Regards
Andy Fan


Re: Support prepared statement invalidation when result types change

2023-09-13 Thread Euler Taveira
On Tue, Sep 12, 2023, at 10:17 AM, Jelte Fennema wrote:
> When running the Postgres JDBC tests with this patchset I found dumb
> mistake in my last patch where I didn't initialize the contents of
> orig_params correctly. This new patchset  fixes that.
> 
0001:

Don't you want to execute this code path only for EXECUTE command?
PORTAL_UTIL_SELECT includes other commands such as CALL, FETCH, SHOW, and
EXPLAIN. If so, check if commandTag is CMDTAG_EXECUTE.

Regarding tupDesc, you don't need to call FreeTupleDesc instead you can modify
PortalStart as

case PORTAL_UTIL_SELECT:

/*
 * We don't set snapshot here, because PortalRunUtility will
 * take care of it if needed.
 */
{
PlannedStmt *pstmt = PortalGetPrimaryStmt(portal);

Assert(pstmt->commandType == CMD_UTILITY);
/*
 * tupDesc will be filled by FillPortalStore later because
 * it might change due to replanning when ExecuteQuery calls
 * GetCachedPlan.
 */
if (portal->commandTag != CMDTAG_EXECUTE)
portal->tupDesc = 
UtilityTupleDescriptor(pstmt->utilityStmt);
}

Regarding the commit message, ...if the the result... should be fixed. The
sentence "it's actually not needed..." could be "It doesn't need to be an error
as long as it sends the RowDescription...". The sentence "This patch starts to
allow a prepared ..." could be "This patch allows a prepared ...".

0002:

You should remove this comment because it refers to the option you are
removing.

-  plan->cursor_options,
-  false);  /* not fixed result */
+  plan->cursor_options);   /* not fixed result */

You should also remove the sentence that refers to fixed_result in
CompleteCachedPlan.

* cursor_options: options bitmask to pass to planner
* fixed_result: true to disallow future changes in query's result tupdesc
*/
void
CompleteCachedPlan(CachedPlanSource *plansource,
   List *querytree_list,
   MemoryContext querytree_context,

0003:

You should initialize the new parameters (orig_param_types and orig_num_params)
in CreateCachedPlan. One suggestion is to move the following code to
CompleteCachedPlan because plansource->param_types are assigned there.

@@ -108,6 +108,10 @@ PrepareQuery(ParseState *pstate, PrepareStmt *stmt,

argtypes[i++] = toid;
}
+
+   plansource->orig_num_params = nargs;
+   plansource->orig_param_types = MemoryContextAlloc(plansource->context, 
nargs * sizeof(Oid));
+   memcpy(plansource->orig_param_types, argtypes, nargs * sizeof(Oid));
}

This comment is confusing. Since the new function
(GetCachedPlanFromRevalidated) contains almost all code from GetCachedPlan, its
content is the same as the *previous* GetCachedPlan function. You could expand
this comment a bit to make it clear that it contains the logic to decide
between generic x custom plan. I don't like the function name but have only a
not-so-good suggestion: GetCachedPlanAfterRevalidate. I also don't like the
revalidationResult as a variable name. Why don't you keep qlist? Or use a name
near query-tree list (query_list? qtlist?). s/te caller/the caller/

+ * GetCachedPlanFromRevalidated: is the same as get GetCachedPlan, but requires
+ * te caller to first revalidate the query. This is needed for callers that
+ * need to use the revalidated plan to generate boundParams.
+ */
+CachedPlan *
+GetCachedPlanFromRevalidated(CachedPlanSource *plansource,
+ParamListInfo boundParams,
+ResourceOwner owner,
+QueryEnvironment *queryEnv,
+List *revalidationResult)


Are these names accurate? The original are the current ones; new ones are
"runtime" data. It would be good to explain why a new array is required.

Oid*param_types;/* array of parameter type OIDs, or NULL */
int num_params; /* length of param_types array */
+   Oid*orig_param_types;   /* array of original parameter type OIDs,
+* or NULL */
+   int orig_num_params;/* length of orig_param_types array */

You should expand the commit message a bit. Explain this feature. Inform the
behavior change.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Support prepared statement invalidation when result types change

2023-09-12 Thread Jelte Fennema
When running the Postgres JDBC tests with this patchset I found dumb
mistake in my last patch where I didn't initialize the contents of
orig_params correctly. This new patchset  fixes that.


v4-0001-Support-prepared-statement-invalidation-when-resu.patch
Description: Binary data


v4-0002-Completely-remove-fixed_result-from-CachedPlanSou.patch
Description: Binary data


v4-0003-Support-changing-argument-types-of-prepared-state.patch
Description: Binary data


Re: Support prepared statement invalidation when result types change

2023-09-08 Thread Jelte Fennema
Another similar issue was reported on the PgBouncer prepared statement PR[1].

It's related to an issue that was already reported on the
mailinglist[2]. It turns out that invalidation of the argument types
is also important in some cases. The newly added 3rd patch in this
series addresses that issue.

[1]: https://github.com/pgbouncer/pgbouncer/pull/845#discussion_r1309454695
[2]: 
https://www.postgresql.org/message-id/flat/CA%2Bmi_8YAGf9qibDFTRNKgaTwaBa1OUcteKqLAxfMmKFbo3GHZg%40mail.gmail.com
From 3bfc47cfa333158c2e0a2e0603e311d141fc473b Mon Sep 17 00:00:00 2001
From: Jelte Fennema 
Date: Fri, 25 Aug 2023 17:09:38 +0200
Subject: [PATCH v3 1/3] Support prepared statement invalidation when result
 types change

The cached plan for a prepared statements can get invalidated when DDL
changes the tables used in the query, or when search_path changes. When
this happens the prepared statement can still be executed, but it will
be replanned in the new context. This means that the prepared statement
will do something different e.g. in case of search_path changes it will
select data from a completely different table. This won't throw an
error, because it is considered the responsibility of the operator and
query writers that the query will still do the intended thing.

However, we would throw an error if the the result of the query is of a
different type than it was before. This was not documented anywhere and
can thus be a surprising error to hit. But it's actually not needed for
this to be an error, as long as we send the correct RowDescription there
does not have to be a problem for clients when the result types or
column counts change.

This patch starts to allow a prepared statement to continue to work even
when the result type changes.

Without this change all clients that automatically prepare queries as a
performance optimization will need to handle or avoid the error somehow,
often resulting in deallocating and re-preparing queries when its
usually not necessary. With this change connection poolers can also
safely prepare the same query only once on a connection and share this
one prepared query across clients that prepared that exact same query.
---
 src/backend/tcop/postgres.c | 26 -
 src/backend/tcop/pquery.c   | 16 +++
 src/backend/utils/cache/plancache.c |  5 -
 src/test/regress/expected/plancache.out | 24 +--
 src/test/regress/sql/plancache.sql  |  7 +++
 5 files changed, 62 insertions(+), 16 deletions(-)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 21b9763183e..8ea654a4c26 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1252,7 +1252,31 @@ exec_simple_query(const char *query_string)
 	format = 1; /* BINARY */
 			}
 		}
-		PortalSetResultFormat(portal, 1, );
+
+		if (portal->strategy == PORTAL_UTIL_SELECT)
+		{
+			/*
+			 * For SELECT-like queries we clear the tupDesc now, and it will
+			 * be filled in later by FillPortalStore, because the tupDesc
+			 * might change due to replanning when ExecuteQuery calls
+			 * GetCachedPlan. So we should only fetch the tupDesc after the
+			 * query is actually executed. This also means that we cannot set
+			 * the result format for the output tuple yet, so we temporarily
+			 * store the desired format in portal->formats. Then after
+			 * creating the actual tupDesc we call PortalSetResultFormat,
+			 * using this format.
+			 */
+			FreeTupleDesc(portal->tupDesc);
+			portal->tupDesc = NULL;
+
+			portal->formats = (int16 *) MemoryContextAlloc(portal->portalContext,
+		   sizeof(int16));
+			portal->formats[0] = format;
+		}
+		else
+		{
+			PortalSetResultFormat(portal, 1, );
+		}
 
 		/*
 		 * Now we can create the destination receiver object.
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5565f200c3d..4f7633faf46 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -1030,6 +1030,22 @@ FillPortalStore(Portal portal, bool isTopLevel)
 		case PORTAL_UTIL_SELECT:
 			PortalRunUtility(portal, linitial_node(PlannedStmt, portal->stmts),
 			 isTopLevel, true, treceiver, );
+
+			if (!portal->tupDesc)
+			{
+/*
+ * Now fill in the tupDesc and formats fields of the portal.
+ * We delayed this because for EXECUTE queries we only know
+ * the tuple after they were executed, because its original
+ * plan might get replaced with a new one that returns
+ * different columns, due to the table having changed since
+ * the last PREPARE/EXECUTE command.
+ */
+PlannedStmt *pstmt = PortalGetPrimaryStmt(portal);
+
+portal->tupDesc = UtilityTupleDescriptor(pstmt->utilityStmt);
+PortalSetResultFormat(portal, 1, portal->formats);
+			}
 			break;
 
 		default:
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 7d4168f82f5..a6fbe2

Re: Support prepared statement invalidation when result types change

2023-08-29 Thread Jelte Fennema
On Tue, 29 Aug 2023 at 11:29, jian he  wrote:
> regression=# CREATE TEMP TABLE pcachetest AS SELECT * FROM int8_tbl;
> SELECT 5
> regression=# PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest
> WHERE q1 = $1;'
> PREPARE
> regression=# alter table pcachetest rename q1 to x;
> ALTER TABLE
> regression=# EXECUTE prepstmt2(123);
> 2023-08-29 17:23:59.148 CST [1382074] ERROR:  column "q1" does not
> exist at character 61
> 2023-08-29 17:23:59.148 CST [1382074] HINT:  Perhaps you meant to
> reference the column "pcachetest.q2".
> 2023-08-29 17:23:59.148 CST [1382074] STATEMENT:  EXECUTE prepstmt2(123);
> ERROR:  column "q1" does not exist
> HINT:  Perhaps you meant to reference the column "pcachetest.q2".

Thank you for the full set of commands. In that case the issue you're
describing is completely separate from this patch. The STATEMENT: part
of the log will always show the query that was received by the server.
This behaviour was already present even without my patch (I double
checked with PG15.3).




Re: Support prepared statement invalidation when result types change

2023-08-29 Thread jian he
On Tue, Aug 29, 2023 at 12:51 AM Jelte Fennema  wrote:
>
> Could you share the full set of commands that cause the reporting
> issue? I don't think my changes should impact this reporting, so I'm
> curious if this is a new issue, or an already existing one.

I didn't apply your v2 patch.
full set of commands:

regression=# CREATE TEMP TABLE pcachetest AS SELECT * FROM int8_tbl;
SELECT 5
regression=# PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest
WHERE q1 = $1;'
PREPARE
regression=# alter table pcachetest rename q1 to x;
ALTER TABLE
regression=# EXECUTE prepstmt2(123);
2023-08-29 17:23:59.148 CST [1382074] ERROR:  column "q1" does not
exist at character 61
2023-08-29 17:23:59.148 CST [1382074] HINT:  Perhaps you meant to
reference the column "pcachetest.q2".
2023-08-29 17:23:59.148 CST [1382074] STATEMENT:  EXECUTE prepstmt2(123);
ERROR:  column "q1" does not exist
HINT:  Perhaps you meant to reference the column "pcachetest.q2".
--




Re: Support prepared statement invalidation when result types change

2023-08-28 Thread Jelte Fennema
On Mon, 28 Aug 2023 at 11:27, jian he  wrote:
> With parameters, it also works, only a tiny issue with error reporting.
>
> prepstmt2 | PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest
> WHERE q1 = $1; | {bigint}| {bigint,bigint,bigint}
> ERROR:  column "q1" does not exist at character 61
> HINT:  Perhaps you meant to reference the column "pcachetest.x1".
> STATEMENT:  execute prepstmt2(1);
>
> I think "character 61" refer to "PREPARE prepstmt2(bigint) AS SELECT *
> FROM pcachetest WHERE q1 = $1;"
> so maybe the STATEMENT is slightly misleading.

Could you share the full set of commands that cause the reporting
issue? I don't think my changes should impact this reporting, so I'm
curious if this is a new issue, or an already existing one.




Re: Support prepared statement invalidation when result types change

2023-08-28 Thread Jelte Fennema
On Mon, 28 Aug 2023 at 15:05, Konstantin Knizhnik  wrote:
> The following assignment of format is not corrects:
>
> It has to be copied as below:
>
>  portal->formats = (int16 *)
>  MemoryContextAlloc(portal->portalContext,
> natts * sizeof(int16));
>  memcpy(portal->formats, formats, natts * sizeof(int16));

I attached a new version of the patch where I now did this. But I also
moved the code around quite a bit, since all this tupDesc/format
delaying is only needed for exec_simple_query. The original changes
actually broke some prepared statements that were using protocol level
Bind messages.


v2-0001-Support-prepared-statement-invalidation-when-resu.patch
Description: Binary data


v2-0002-Completely-remove-fixed_result-from-CachedPlanSou.patch
Description: Binary data


Re: Support prepared statement invalidation when result types change

2023-08-28 Thread Konstantin Knizhnik




On 25.08.2023 8:57 PM, Jelte Fennema wrote:

The cached plan for a prepared statements can get invalidated when DDL
changes the tables used in the query, or when search_path changes. When
this happens the prepared statement can still be executed, but it will
be replanned in the new context. This means that the prepared statement
will do something different e.g. in case of search_path changes it will
select data from a completely different table. This won't throw an
error, because it is considered the responsibility of the operator and
query writers that the query will still do the intended thing.

However, we would throw an error if the the result of the query is of a
different type than it was before:
ERROR: cached plan must not change result type

This requirement was not documented anywhere and it
can thus be a surprising error to hit. But it's actually not needed for
this to be an error, as long as we send the correct RowDescription there
does not have to be a problem for clients when the result types or
column counts change.

This patch starts to allow a prepared statement to continue to work even
when the result type changes.

Without this change all clients that automatically prepare queries as a
performance optimization will need to handle or avoid the error somehow,
often resulting in deallocating and re-preparing queries when its
usually not necessary. With this change connection poolers can also
safely prepare the same query only once on a connection and share this
one prepared query across clients that prepared that exact same query.

Some relevant previous discussions:
[1]: 
https://www.postgresql.org/message-id/flat/CAB%3DJe-GQOW7kU9Hn3AqP1vhaZg_wE9Lz6F4jSp-7cm9_M6DyVA%40mail.gmail.com
[2]: 
https://stackoverflow.com/questions/2783813/postgres-error-cached-plan-must-not-change-result-type
[3]: 
https://stackoverflow.com/questions/42119365/how-to-avoid-cached-plan-must-not-change-result-type-error
[4]: https://github.com/pgjdbc/pgjdbc/pull/451
[5]: https://github.com/pgbouncer/pgbouncer/pull/845#discussion_r1305295551
[6]: https://github.com/jackc/pgx/issues/927
[7]: 
https://elixirforum.com/t/postgrex-errors-with-cached-plan-must-not-change-result-type-during-migration/51235/2
[8]: https://github.com/rails/rails/issues/12330



The following assignment of format is not corrects:

 /* Do nothing if portal won't return tuples */
 if (portal->tupDesc == NULL)
+    {
+        /*
+         * For SELECT like queries we delay filling in the tupDesc 
until after

+         * PortalRunUtility, because we don't know what rows an EXECUTE
+         * command will return. Because we delay setting tupDesc, we 
also need

+         * to delay setting formats. We do this in a pretty hacky way, by
+         * temporarily setting the portal formats to the passed in formats.
+         * Then once we fill in tupDesc, we call PortalSetResultFormat 
again

+         * with portal->formats to fill in the final formats value.
+         */
+        if (portal->strategy == PORTAL_UTIL_SELECT)
+            portal->formats = formats;
     return;


because it is create in other memory context:

postgres.c:
    /* Done storing stuff in portal's context */
    MemoryContextSwitchTo(oldContext);
    ...
 /* Get the result format codes */
    numRFormats = pq_getmsgint(input_message, 2);
    if (numRFormats > 0)
    {
        rformats = palloc_array(int16, numRFormats);
        for (int i = 0; i < numRFormats; i++)
            rformats[i] = pq_getmsgint(input_message, 2);
    }



It has to be copied as below:

    portal->formats = (int16 *)
    MemoryContextAlloc(portal->portalContext,
                       natts * sizeof(int16));
    memcpy(portal->formats, formats, natts * sizeof(int16));


or alternatively  MemoryContextSwitchTo(oldContext) should be moved 
after initialization of rformat





Re: Support prepared statement invalidation when result types change

2023-08-28 Thread jian he
On Sat, Aug 26, 2023 at 1:58 AM Jelte Fennema  wrote:
>
> The cached plan for a prepared statements can get invalidated when DDL
> changes the tables used in the query, or when search_path changes. When
> this happens the prepared statement can still be executed, but it will
> be replanned in the new context. This means that the prepared statement
> will do something different e.g. in case of search_path changes it will
> select data from a completely different table. This won't throw an
> error, because it is considered the responsibility of the operator and
> query writers that the query will still do the intended thing.
>
> However, we would throw an error if the the result of the query is of a
> different type than it was before:
> ERROR: cached plan must not change result type
>
> This requirement was not documented anywhere and it
> can thus be a surprising error to hit. But it's actually not needed for
> this to be an error, as long as we send the correct RowDescription there
> does not have to be a problem for clients when the result types or
> column counts change.
>
> This patch starts to allow a prepared statement to continue to work even
> when the result type changes.
>
> Without this change all clients that automatically prepare queries as a
> performance optimization will need to handle or avoid the error somehow,
> often resulting in deallocating and re-preparing queries when its
> usually not necessary. With this change connection poolers can also
> safely prepare the same query only once on a connection and share this
> one prepared query across clients that prepared that exact same query.
>
> Some relevant previous discussions:
> [1]: 
> https://www.postgresql.org/message-id/flat/CAB%3DJe-GQOW7kU9Hn3AqP1vhaZg_wE9Lz6F4jSp-7cm9_M6DyVA%40mail.gmail.com
> [2]: 
> https://stackoverflow.com/questions/2783813/postgres-error-cached-plan-must-not-change-result-type
> [3]: 
> https://stackoverflow.com/questions/42119365/how-to-avoid-cached-plan-must-not-change-result-type-error
> [4]: https://github.com/pgjdbc/pgjdbc/pull/451
> [5]: https://github.com/pgbouncer/pgbouncer/pull/845#discussion_r1305295551
> [6]: https://github.com/jackc/pgx/issues/927
> [7]: 
> https://elixirforum.com/t/postgrex-errors-with-cached-plan-must-not-change-result-type-during-migration/51235/2
> [8]: https://github.com/rails/rails/issues/12330

prepared statement with no parameters, tested many cases (add column,
change column data type, rename column, set default, set not null), it
worked as expected.
With parameters, it also works, only a tiny issue with error reporting.

prepstmt2 | PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest
WHERE q1 = $1; | {bigint}| {bigint,bigint,bigint}
ERROR:  column "q1" does not exist at character 61
HINT:  Perhaps you meant to reference the column "pcachetest.x1".
STATEMENT:  execute prepstmt2(1);

I think "character 61" refer to "PREPARE prepstmt2(bigint) AS SELECT *
FROM pcachetest WHERE q1 = $1;"
so maybe the STATEMENT is slightly misleading.




Support prepared statement invalidation when result types change

2023-08-25 Thread Jelte Fennema
The cached plan for a prepared statements can get invalidated when DDL
changes the tables used in the query, or when search_path changes. When
this happens the prepared statement can still be executed, but it will
be replanned in the new context. This means that the prepared statement
will do something different e.g. in case of search_path changes it will
select data from a completely different table. This won't throw an
error, because it is considered the responsibility of the operator and
query writers that the query will still do the intended thing.

However, we would throw an error if the the result of the query is of a
different type than it was before:
ERROR: cached plan must not change result type

This requirement was not documented anywhere and it
can thus be a surprising error to hit. But it's actually not needed for
this to be an error, as long as we send the correct RowDescription there
does not have to be a problem for clients when the result types or
column counts change.

This patch starts to allow a prepared statement to continue to work even
when the result type changes.

Without this change all clients that automatically prepare queries as a
performance optimization will need to handle or avoid the error somehow,
often resulting in deallocating and re-preparing queries when its
usually not necessary. With this change connection poolers can also
safely prepare the same query only once on a connection and share this
one prepared query across clients that prepared that exact same query.

Some relevant previous discussions:
[1]: 
https://www.postgresql.org/message-id/flat/CAB%3DJe-GQOW7kU9Hn3AqP1vhaZg_wE9Lz6F4jSp-7cm9_M6DyVA%40mail.gmail.com
[2]: 
https://stackoverflow.com/questions/2783813/postgres-error-cached-plan-must-not-change-result-type
[3]: 
https://stackoverflow.com/questions/42119365/how-to-avoid-cached-plan-must-not-change-result-type-error
[4]: https://github.com/pgjdbc/pgjdbc/pull/451
[5]: https://github.com/pgbouncer/pgbouncer/pull/845#discussion_r1305295551
[6]: https://github.com/jackc/pgx/issues/927
[7]: 
https://elixirforum.com/t/postgrex-errors-with-cached-plan-must-not-change-result-type-during-migration/51235/2
[8]: https://github.com/rails/rails/issues/12330
From b1a58082f2b226b37d237580e33b52438d480f48 Mon Sep 17 00:00:00 2001
From: Jelte Fennema 
Date: Fri, 25 Aug 2023 17:09:38 +0200
Subject: [PATCH v1 1/2] Support prepared statement invalidation when result
 types change

The cached plan for a prepared statements can get invalidated when DDL
changes the tables used in the query, or when search_path changes. When
this happens the prepared statement can still be executed, but it will
be replanned in the new context. This means that the prepared statement
will do something different e.g. in case of search_path changes it will
select data from a completely different table. This won't throw an
error, because it is considered the responsibility of the operator and
query writers that the query will still do the intended thing.

However, we would throw an error if the the result of the query is of a
different type than it was before. This was not documented anywhere and
can thus be a surprising error to hit. But it's actually not needed for
this to be an error, as long as we send the correct RowDescription there
does not have to be a problem for clients when the result types or
column counts change.

This patch starts to allow a prepared statement to continue to work even
when the result type changes.

Without this change all clients that automatically prepare queries as a
performance optimization will need to handle or avoid the error somehow,
often resulting in deallocating and re-preparing queries when its
usually not necessary. With this change connection poolers can also
safely prepare the same query only once on a connection and share this
one prepared query across clients that prepared that exact same query.
---
 src/backend/tcop/pquery.c   | 46 +++--
 src/backend/utils/cache/plancache.c |  5 ---
 src/test/regress/expected/plancache.out | 24 +
 src/test/regress/sql/plancache.sql  |  7 ++--
 4 files changed, 56 insertions(+), 26 deletions(-)

diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5565f200c3d..ee790009cd2 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -561,23 +561,21 @@ PortalStart(Portal portal, ParamListInfo params,
 
 			case PORTAL_UTIL_SELECT:
 
-/*
- * We don't set snapshot here, because PortalRunUtility will
- * take care of it if needed.
- */
-{
-	PlannedStmt *pstmt = PortalGetPrimaryStmt(portal);
-
-	Assert(pstmt->commandType == CMD_UTILITY);
-	portal->tupDesc = UtilityTupleDescriptor(pstmt->utilityStmt);
-}
-
 /*
  * Reset cursor position data to "start of query"
  */
 portal->atStart = true;
 portal->atEnd = false;	/* allow fetches */
 portal->portalPos = 0;
+