Re: Support prepared statement invalidation when result types change
On Sun, 7 Jan 2024 at 07:55, vignesh C wrote:
> One of the test has aborted in CFBot at [1] with:

Rebased the patchset and removed patch 0003, since it was causing the CI issue reported by vignesh and it seems much less useful and more controversial to me anyway (due to the extra required roundtrip).

On Sun, 7 Jan 2024 at 09:17, Shay Rojansky wrote:
> Just to point out, FWIW, that the .NET Npgsql driver does indeed cache
> RowDescriptions... The whole point of preparation is to optimize things as
> much as possible for repeated execution of the query; I get that the value
> there is much lower than e.g. doing another network roundtrip, but that's
> still extra work that's better off being cut if it can be.

Hmm, interesting. I totally agree that it's always good to do less when possible. The problem is that in the face of server-side prepared statement invalidations due to DDL changes to the table or search_path changes, the row types might change. Otherwise the server needs to constantly throw an error, like it does now, but that seems worse imho.

I'm wondering though if we can create a middle ground, where a client can still cache the RowDescription client side when no DDL or search_path changes are happening, but still tell the client about a new RowDescription when they do happen. The only way of doing that I can think of is changing the Postgres protocol in a way similar to this: allow Execute to return a RowDescription too, but have the server only do so when the previously received RowDescription for this prepared statement is now invalid.

This would definitely require some additional tracking in PgBouncer to make it work, i.e. per client and per server it should now keep track of the last RowDescription for each prepared statement. But that's definitely something we could do. This would make this patch much more involved though, as now it would suddenly involve an actual protocol change, and that basically depends on this patch moving forward[1].
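To make the bookkeeping concrete, here is a minimal sketch (not PgBouncer code, and the class and method names are made up) of the per-connection tracking described above: the pooler remembers the last RowDescription payload each side has seen per prepared statement, and only forwards a new RowDescription to a client when it differs from the client's cached copy.

```python
class RowDescriptionTracker:
    """Hypothetical pooler-side cache of the last RowDescription seen
    per (connection, prepared statement)."""

    def __init__(self):
        # (connection_id, statement_name) -> last RowDescription payload
        self.last_seen = {}

    def on_row_description(self, server_id, statement, payload):
        """Record the RowDescription most recently sent by a server."""
        self.last_seen[(server_id, statement)] = payload

    def forward_to_client(self, client_id, statement, payload):
        """Return True if this client needs a (new) RowDescription,
        False if its cached copy is still valid."""
        key = (client_id, statement)
        if self.last_seen.get(key) == payload:
            return False  # client already has this exact descriptor
        self.last_seen[key] = payload
        return True
```

Under the hypothetical protocol change, `forward_to_client` returning True corresponds to the rare invalidation case where Execute carries a fresh RowDescription; returning False is the common steady-state case where nothing extra is sent.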
[1]: https://www.postgresql.org/message-id/flat/cageczqtg2hcmb5gau53uuwcdc7gcnjfll6mnw0wnhwhgq9u...@mail.gmail.com

v6-0002-Completely-remove-fixed_result-from-CachedPlanSou.patch  Description: Binary data
v6-0001-Support-prepared-statement-invalidation-when-resu.patch  Description: Binary data
Re: Support prepared statement invalidation when result types change
On Mon, Sep 18, 2023 at 1:31 PM Jelte Fennema-Nio wrote:
> Furthermore caching RowDescription is also not super useful, most
> clients request it every time because it does not require an extra
> round trip, so there's almost no overhead in requesting it.

Just to point out, FWIW, that the .NET Npgsql driver does indeed cache RowDescriptions... The whole point of preparation is to optimize things as much as possible for repeated execution of the query; I get that the value there is much lower than e.g. doing another network roundtrip, but that's still extra work that's better off being cut if it can be.
Re: Support prepared statement invalidation when result types change
On Mon, 18 Sept 2023 at 18:01, Jelte Fennema-Nio wrote:
>
> @Euler thanks for the review. I addressed the feedback.
>
> On Fri, 15 Sept 2023 at 01:41, Andy Fan wrote:
> > What if a client has *cached* an old version of RowDescription
> > and the server changed it to something new and sent resultdata
> > with the new RowDescription. Will the client still be able to work
> > expectly?
>
> It depends a bit on the exact change. For instance a column being
> added to the end of the resultdata shouldn't be a problem. And that is
> actually quite a common case for this issue:
> 1. PREPARE p as (SELECT * FROM t);
> 2. ALTER TABLE t ADD COLUMN ...
> 3. EXECUTE p
>
> But type changes of existing columns might cause issues when the
> RowDescription is cached. But such changes also cause issues now.
> Currently the prepared statement becomes unusable when this happens
> (returning errors every time). With this patch it's at least possible
> to have prepared statements continue working in many cases.
> Furthermore caching RowDescription is also not super useful, most
> clients request it every time because it does not require an extra
> round trip, so there's almost no overhead in requesting it.
>
> Clients caching ParameterDescription seems more useful because
> fetching the parameter types does require an extra round trip. So
> caching it could cause errors with 0003. But right now if the argument
> types need to change it gives an error every time when executing the
> prepared statement. So I believe 0003 is still an improvement over the
> status quo, because there are many cases where the client knows that
> the types might have changed and it thus needs to re-fetch the
> ParameterDescription: the most common case is changing the
> search_path. And there's also cases where even a cached
> ParamaterDescription will work fine: e.g. the type is changed but the
> encoding stays the same (e.g.
> drop + create an enum, or text/varchar,
> or the text encoding of int and bigint)

One of the tests aborted in CFBot at [1] with:

[05:26:16.214] Core was generated by `postgres: postgres regression_pg_stat_statements [local] EXECUTE '.
[05:26:16.214] Program terminated with signal SIGABRT, Aborted.
[05:26:16.214] #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
[05:26:16.214] Download failed: Invalid argument. Continuing without source file ./signal/../sysdeps/unix/sysv/linux/raise.c.
[05:26:16.392]
[05:26:16.392] Thread 1 (Thread 0x7fbe1d997a40 (LWP 28738)):
[05:26:16.392] #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
[05:26:36.911] #5 0x55c5aa523e71 in RevalidateCachedQuery (plansource=0x55c5ac811cf0, queryEnv=queryEnv@entry=0x0) at ../src/backend/utils/cache/plancache.c:730
[05:26:36.911]         num_params = 0
[05:26:36.911]         param_types = 0x55c5ac860438
[05:26:36.911]         snapshot_set = false
[05:26:36.911]         rawtree = 0x55c5ac859f08
[05:26:36.911]         tlist =
[05:26:36.911]         qlist =
[05:26:36.911]         resultDesc =
[05:26:36.911]         querytree_context =
[05:26:36.911]         oldcxt =
[05:26:36.911] #6 0x55c5a9de0262 in ExplainExecuteQuery (execstmt=0x55c5ac6d9d38, into=0x0, es=0x55c5ac859648, queryString=0x55c5ac6d91e0 "EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st6;", params=0x0, queryEnv=0x0) at ../src/backend/commands/prepare.c:594
[05:26:36.911]         entry = 0x55c5ac839ba8
[05:26:36.911]         query_string =
[05:26:36.911]         cplan =
[05:26:36.911]         plan_list =
[05:26:36.911]         p =
[05:26:36.911]         paramLI = 0x0
[05:26:36.911]         estate = 0x0
[05:26:36.911]         planstart = {ticks = }
[05:26:36.911]         planduration = {ticks = 1103806595203}
[05:26:36.911]         bufusage_start = {shared_blks_hit = 1, shared_blks_read = 16443, shared_blks_dirtied = 16443, shared_blks_written = 8, local_blks_hit = 94307489783240, local_blks_read = 94307451894117, local_blks_dirtied = 94307489783240, local_blks_written = 140727004487184, temp_blks_read = 0, temp_blks_written = 94307489783416, shared_blk_read_time = {ticks = 0}, shared_blk_write_time = {ticks = 94307489780192}, local_blk_read_time = {ticks = 0}, local_blk_write_time = {ticks = 94307491025040}, temp_blk_read_time = {ticks = 0}, temp_blk_write_time = {ticks = 0}}
[05:26:36.911]         bufusage = {shared_blks_hit = 140727004486192, shared_blks_read = 140061866319832, shared_blks_dirtied = 8, shared_blks_written = 94307447196988, local_blks_hit = 34359738376, local_blks_read = 94307489783240, local_blks_dirtied = 70622147264512, local_blks_written = 94307491357144, temp_blks_read = 140061866319832, temp_blks_written = 94307489783240, shared_blk_read_time = {ticks = 140727004486192}, shared_blk_write_time = {ticks = 140061866319832}, local_blk_read_time = {ticks = 8}, local_blk_write_time = {ticks = 94307489783240}, temp_blk_read_time = {ticks = 0}, temp_blk_write_time = {ticks = 94307447197163}}
Re: Support prepared statement invalidation when result types change
@Euler thanks for the review. I addressed the feedback.

On Fri, 15 Sept 2023 at 01:41, Andy Fan wrote:
> What if a client has *cached* an old version of RowDescription
> and the server changed it to something new and sent resultdata
> with the new RowDescription. Will the client still be able to work
> expectly?

It depends a bit on the exact change. For instance a column being added to the end of the resultdata shouldn't be a problem. And that is actually quite a common case for this issue:
1. PREPARE p as (SELECT * FROM t);
2. ALTER TABLE t ADD COLUMN ...
3. EXECUTE p

But type changes of existing columns might cause issues when the RowDescription is cached. But such changes also cause issues now: currently the prepared statement becomes unusable when this happens (returning errors every time). With this patch it's at least possible to have prepared statements continue working in many cases. Furthermore, caching RowDescription is also not super useful; most clients request it every time because it does not require an extra round trip, so there's almost no overhead in requesting it.

Clients caching ParameterDescription seems more useful, because fetching the parameter types does require an extra round trip. So caching it could cause errors with 0003. But right now if the argument types need to change it gives an error every time when executing the prepared statement. So I believe 0003 is still an improvement over the status quo, because there are many cases where the client knows that the types might have changed and it thus needs to re-fetch the ParameterDescription: the most common case is changing the search_path. And there are also cases where even a cached ParameterDescription will work fine: e.g. the type is changed but the encoding stays the same (e.g. drop + create an enum, or text/varchar, or the text encoding of int and bigint).

v5-0003-Support-changing-argument-types-of-prepared-state.patch  Description: Binary data
v5-0002-Completely-remove-fixed_result-from-CachedPlanSou.patch  Description: Binary data
v5-0001-Support-prepared-statement-invalidation-when-resu.patch  Description: Binary data
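The cases enumerated above (columns appended at the end are safe; a type change of an existing column is only safe when the text encoding stays the same) can be sketched as a client-side compatibility check. This is illustrative logic only, not code from any real driver, and the table of text-compatible type pairs is an assumption for the example:

```python
# Descriptors are modeled as lists of (column_name, type_name) tuples.
# Hypothetical set of type pairs whose text encoding is interchangeable,
# mirroring the examples in the mail (text/varchar, int4/int8).
TEXT_COMPATIBLE = {frozenset({"int4", "int8"}), frozenset({"text", "varchar"})}


def cached_descriptor_ok(old_desc, new_desc):
    """Return True if a client that cached old_desc can still decode
    rows described by new_desc."""
    if len(new_desc) < len(old_desc):
        return False  # columns disappeared: cached descriptor is wrong
    for (_, old_t), (_, new_t) in zip(old_desc, new_desc):
        if old_t != new_t and frozenset({old_t, new_t}) not in TEXT_COMPATIBLE:
            return False  # incompatible type change of an existing column
    return True  # identical prefix; extra appended columns are harmless
```

For example, the common ALTER TABLE ... ADD COLUMN case passes the check, while changing a column to a type with a different encoding fails it.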
Re: Support prepared statement invalidation when result types change
Hi,

> This requirement was not documented anywhere and it
> can thus be a surprising error to hit. But it's actually not needed for
> this to be an error, as long as we send the correct RowDescription there
> does not have to be a problem for clients when the result types or
> column counts change.

What if a client has *cached* an old version of RowDescription and the server changed it to something new and sent resultdata with the new RowDescription. Will the client still be able to work as expected?

I hope my concern is unfounded, since I didn't go through any of the drivers in detail, but I hope the concern itself is expressed correctly.

--
Best Regards
Andy Fan
Re: Support prepared statement invalidation when result types change
On Tue, Sep 12, 2023, at 10:17 AM, Jelte Fennema wrote:
> When running the Postgres JDBC tests with this patchset I found dumb
> mistake in my last patch where I didn't initialize the contents of
> orig_params correctly. This new patchset fixes that.

0001:

Don't you want to execute this code path only for the EXECUTE command? PORTAL_UTIL_SELECT includes other commands such as CALL, FETCH, SHOW, and EXPLAIN. If so, check if commandTag is CMDTAG_EXECUTE.

Regarding tupDesc, you don't need to call FreeTupleDesc; instead you can modify PortalStart as:

    case PORTAL_UTIL_SELECT:

        /*
         * We don't set snapshot here, because PortalRunUtility will
         * take care of it if needed.
         */
        {
            PlannedStmt *pstmt = PortalGetPrimaryStmt(portal);

            Assert(pstmt->commandType == CMD_UTILITY);

            /*
             * tupDesc will be filled by FillPortalStore later because
             * it might change due to replanning when ExecuteQuery calls
             * GetCachedPlan.
             */
            if (portal->commandTag != CMDTAG_EXECUTE)
                portal->tupDesc = UtilityTupleDescriptor(pstmt->utilityStmt);
        }

Regarding the commit message, "...if the the result..." should be fixed. The sentence "it's actually not needed..." could be "It doesn't need to be an error as long as it sends the RowDescription...". The sentence "This patch starts to allow a prepared ..." could be "This patch allows a prepared ...".

0002:

You should remove this comment because it refers to the option you are removing.

    - plan->cursor_options,
    - false); /* not fixed result */
    + plan->cursor_options); /* not fixed result */

You should also remove the sentence that refers to fixed_result in CompleteCachedPlan.

     * cursor_options: options bitmask to pass to planner
     * fixed_result: true to disallow future changes in query's result tupdesc
     */
    void
    CompleteCachedPlan(CachedPlanSource *plansource,
                       List *querytree_list,
                       MemoryContext querytree_context,

0003:

You should initialize the new parameters (orig_param_types and orig_num_params) in CreateCachedPlan.

One suggestion is to move the following code to CompleteCachedPlan, because plansource->param_types are assigned there.

    @@ -108,6 +108,10 @@ PrepareQuery(ParseState *pstate, PrepareStmt *stmt,
            argtypes[i++] = toid;
        }
    +
    +   plansource->orig_num_params = nargs;
    +   plansource->orig_param_types = MemoryContextAlloc(plansource->context, nargs * sizeof(Oid));
    +   memcpy(plansource->orig_param_types, argtypes, nargs * sizeof(Oid));
    }

This comment is confusing. Since the new function (GetCachedPlanFromRevalidated) contains almost all code from GetCachedPlan, its content is the same as the *previous* GetCachedPlan function. You could expand this comment a bit to make it clear that it contains the logic to decide between generic x custom plan. I don't like the function name but have only a not-so-good suggestion: GetCachedPlanAfterRevalidate. I also don't like revalidationResult as a variable name. Why don't you keep qlist? Or use a name near query-tree list (query_list? qtlist?). s/te caller/the caller/

    + * GetCachedPlanFromRevalidated: is the same as get GetCachedPlan, but requires
    + * te caller to first revalidate the query. This is needed for callers that
    + * need to use the revalidated plan to generate boundParams.
    + */
    +CachedPlan *
    +GetCachedPlanFromRevalidated(CachedPlanSource *plansource,
    +                             ParamListInfo boundParams,
    +                             ResourceOwner owner,
    +                             QueryEnvironment *queryEnv,
    +                             List *revalidationResult)

Are these names accurate? The original ones are the current ones; the new ones are "runtime" data. It would be good to explain why a new array is required.

        Oid        *param_types;    /* array of parameter type OIDs, or NULL */
        int         num_params;     /* length of param_types array */
    +   Oid        *orig_param_types;   /* array of original parameter type OIDs,
    +                                    * or NULL */
    +   int         orig_num_params;    /* length of orig_param_types array */

You should expand the commit message a bit. Explain this feature. Inform the behavior change.

--
Euler Taveira
EDB  https://www.enterprisedb.com/
Re: Support prepared statement invalidation when result types change
When running the Postgres JDBC tests with this patchset I found a dumb mistake in my last patch where I didn't initialize the contents of orig_params correctly. This new patchset fixes that.

v4-0001-Support-prepared-statement-invalidation-when-resu.patch  Description: Binary data
v4-0002-Completely-remove-fixed_result-from-CachedPlanSou.patch  Description: Binary data
v4-0003-Support-changing-argument-types-of-prepared-state.patch  Description: Binary data
Re: Support prepared statement invalidation when result types change
Another similar issue was reported on the PgBouncer prepared statement PR[1]. It's related to an issue that was already reported on the mailing list[2]. It turns out that invalidation of the argument types is also important in some cases. The newly added 3rd patch in this series addresses that issue.

[1]: https://github.com/pgbouncer/pgbouncer/pull/845#discussion_r1309454695
[2]: https://www.postgresql.org/message-id/flat/CA%2Bmi_8YAGf9qibDFTRNKgaTwaBa1OUcteKqLAxfMmKFbo3GHZg%40mail.gmail.com

From 3bfc47cfa333158c2e0a2e0603e311d141fc473b Mon Sep 17 00:00:00 2001
From: Jelte Fennema
Date: Fri, 25 Aug 2023 17:09:38 +0200
Subject: [PATCH v3 1/3] Support prepared statement invalidation when result
 types change

The cached plan for a prepared statements can get invalidated when DDL
changes the tables used in the query, or when search_path changes. When
this happens the prepared statement can still be executed, but it will
be replanned in the new context. This means that the prepared statement
will do something different e.g. in case of search_path changes it will
select data from a completely different table. This won't throw an
error, because it is considered the responsibility of the operator and
query writers that the query will still do the intended thing.

However, we would throw an error if the the result of the query is of a
different type than it was before. This was not documented anywhere and
can thus be a surprising error to hit. But it's actually not needed for
this to be an error, as long as we send the correct RowDescription there
does not have to be a problem for clients when the result types or
column counts change.

This patch starts to allow a prepared statement to continue to work even
when the result type changes. Without this change all clients that
automatically prepare queries as a performance optimization will need to
handle or avoid the error somehow, often resulting in deallocating and
re-preparing queries when its usually not necessary.
With this change connection poolers can also safely prepare the same
query only once on a connection and share this one prepared query across
clients that prepared that exact same query.
---
 src/backend/tcop/postgres.c             | 26 -
 src/backend/tcop/pquery.c               | 16 +++
 src/backend/utils/cache/plancache.c     |  5 -
 src/test/regress/expected/plancache.out | 24 +-
 src/test/regress/sql/plancache.sql      |  7 ++
 5 files changed, 62 insertions(+), 16 deletions(-)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 21b9763183e..8ea654a4c26 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1252,7 +1252,31 @@ exec_simple_query(const char *query_string)
 					format = 1; /* BINARY */
 			}
 		}
-		PortalSetResultFormat(portal, 1, &format);
+
+		if (portal->strategy == PORTAL_UTIL_SELECT)
+		{
+			/*
+			 * For SELECT-like queries we clear the tupDesc now, and it will
+			 * be filled in later by FillPortalStore, because the tupDesc
+			 * might change due to replanning when ExecuteQuery calls
+			 * GetCachedPlan. So we should only fetch the tupDesc after the
+			 * query is actually executed. This also means that we cannot set
+			 * the result format for the output tuple yet, so we temporarily
+			 * store the desired format in portal->formats. Then after
+			 * creating the actual tupDesc we call PortalSetResultFormat,
+			 * using this format.
+			 */
+			FreeTupleDesc(portal->tupDesc);
+			portal->tupDesc = NULL;
+
+			portal->formats = (int16 *) MemoryContextAlloc(portal->portalContext,
+														   sizeof(int16));
+			portal->formats[0] = format;
+		}
+		else
+		{
+			PortalSetResultFormat(portal, 1, &format);
+		}

 		/*
 		 * Now we can create the destination receiver object.
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5565f200c3d..4f7633faf46 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -1030,6 +1030,22 @@ FillPortalStore(Portal portal, bool isTopLevel)
 		case PORTAL_UTIL_SELECT:
 			PortalRunUtility(portal, linitial_node(PlannedStmt, portal->stmts),
 							 isTopLevel, true, treceiver, &qc);
+
+			if (!portal->tupDesc)
+			{
+				/*
+				 * Now fill in the tupDesc and formats fields of the portal.
+				 * We delayed this because for EXECUTE queries we only know
+				 * the tuple after they were executed, because its original
+				 * plan might get replaced with a new one that returns
+				 * different columns, due to the table having changed since
+				 * the last PREPARE/EXECUTE command.
+				 */
+				PlannedStmt *pstmt = PortalGetPrimaryStmt(portal);
+
+				portal->tupDesc = UtilityTupleDescriptor(pstmt->utilityStmt);
+				PortalSetResultFormat(portal, 1, portal->formats);
+			}
 			break;

 		default:
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 7d4168f82f5..a6fbe2
Re: Support prepared statement invalidation when result types change
On Tue, 29 Aug 2023 at 11:29, jian he wrote:
> regression=# CREATE TEMP TABLE pcachetest AS SELECT * FROM int8_tbl;
> SELECT 5
> regression=# PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest WHERE q1 = $1;'
> PREPARE
> regression=# alter table pcachetest rename q1 to x;
> ALTER TABLE
> regression=# EXECUTE prepstmt2(123);
> 2023-08-29 17:23:59.148 CST [1382074] ERROR: column "q1" does not exist at character 61
> 2023-08-29 17:23:59.148 CST [1382074] HINT: Perhaps you meant to reference the column "pcachetest.q2".
> 2023-08-29 17:23:59.148 CST [1382074] STATEMENT: EXECUTE prepstmt2(123);
> ERROR: column "q1" does not exist
> HINT: Perhaps you meant to reference the column "pcachetest.q2".

Thank you for the full set of commands. In that case the issue you're describing is completely separate from this patch. The STATEMENT: part of the log will always show the query that was received by the server. This behaviour was already present even without my patch (I double checked with PG15.3).
Re: Support prepared statement invalidation when result types change
On Tue, Aug 29, 2023 at 12:51 AM Jelte Fennema wrote:
>
> Could you share the full set of commands that cause the reporting
> issue? I don't think my changes should impact this reporting, so I'm
> curious if this is a new issue, or an already existing one.

I didn't apply your v2 patch. Full set of commands:

regression=# CREATE TEMP TABLE pcachetest AS SELECT * FROM int8_tbl;
SELECT 5
regression=# PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest WHERE q1 = $1;'
PREPARE
regression=# alter table pcachetest rename q1 to x;
ALTER TABLE
regression=# EXECUTE prepstmt2(123);
2023-08-29 17:23:59.148 CST [1382074] ERROR: column "q1" does not exist at character 61
2023-08-29 17:23:59.148 CST [1382074] HINT: Perhaps you meant to reference the column "pcachetest.q2".
2023-08-29 17:23:59.148 CST [1382074] STATEMENT: EXECUTE prepstmt2(123);
ERROR: column "q1" does not exist
HINT: Perhaps you meant to reference the column "pcachetest.q2".
--
Re: Support prepared statement invalidation when result types change
On Mon, 28 Aug 2023 at 11:27, jian he wrote:
> With parameters, it also works, only a tiny issue with error reporting.
>
> prepstmt2 | PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest WHERE q1 = $1; | {bigint} | {bigint,bigint,bigint}
>
> ERROR: column "q1" does not exist at character 61
> HINT: Perhaps you meant to reference the column "pcachetest.x1".
> STATEMENT: execute prepstmt2(1);
>
> I think "character 61" refer to "PREPARE prepstmt2(bigint) AS SELECT *
> FROM pcachetest WHERE q1 = $1;"
> so maybe the STATEMENT is slightly misleading.

Could you share the full set of commands that cause the reporting issue? I don't think my changes should impact this reporting, so I'm curious if this is a new issue, or an already existing one.
Re: Support prepared statement invalidation when result types change
On Mon, 28 Aug 2023 at 15:05, Konstantin Knizhnik wrote:
> The following assignment of format is not corrects:
>
> It has to be copied as below:
>
>     portal->formats = (int16 *) MemoryContextAlloc(portal->portalContext,
>                                                    natts * sizeof(int16));
>     memcpy(portal->formats, formats, natts * sizeof(int16));

I attached a new version of the patch where I now did this. But I also moved the code around quite a bit, since all this tupDesc/format delaying is only needed for exec_simple_query. The original changes actually broke some prepared statements that were using protocol-level Bind messages.

v2-0001-Support-prepared-statement-invalidation-when-resu.patch  Description: Binary data
v2-0002-Completely-remove-fixed_result-from-CachedPlanSou.patch  Description: Binary data
Re: Support prepared statement invalidation when result types change
On 25.08.2023 8:57 PM, Jelte Fennema wrote:
> The cached plan for a prepared statements can get invalidated when DDL
> changes the tables used in the query, or when search_path changes. When
> this happens the prepared statement can still be executed, but it will
> be replanned in the new context. This means that the prepared statement
> will do something different e.g. in case of search_path changes it will
> select data from a completely different table. This won't throw an
> error, because it is considered the responsibility of the operator and
> query writers that the query will still do the intended thing.
>
> However, we would throw an error if the the result of the query is of a
> different type than it was before:
> ERROR: cached plan must not change result type
>
> This requirement was not documented anywhere and it can thus be a
> surprising error to hit. But it's actually not needed for this to be an
> error, as long as we send the correct RowDescription there does not
> have to be a problem for clients when the result types or column counts
> change.
>
> This patch starts to allow a prepared statement to continue to work
> even when the result type changes.
>
> Without this change all clients that automatically prepare queries as a
> performance optimization will need to handle or avoid the error
> somehow, often resulting in deallocating and re-preparing queries when
> its usually not necessary. With this change connection poolers can also
> safely prepare the same query only once on a connection and share this
> one prepared query across clients that prepared that exact same query.
>
> Some relevant previous discussions:
> [1]: https://www.postgresql.org/message-id/flat/CAB%3DJe-GQOW7kU9Hn3AqP1vhaZg_wE9Lz6F4jSp-7cm9_M6DyVA%40mail.gmail.com
> [2]: https://stackoverflow.com/questions/2783813/postgres-error-cached-plan-must-not-change-result-type
> [3]: https://stackoverflow.com/questions/42119365/how-to-avoid-cached-plan-must-not-change-result-type-error
> [4]: https://github.com/pgjdbc/pgjdbc/pull/451
> [5]: https://github.com/pgbouncer/pgbouncer/pull/845#discussion_r1305295551
> [6]: https://github.com/jackc/pgx/issues/927
> [7]: https://elixirforum.com/t/postgrex-errors-with-cached-plan-must-not-change-result-type-during-migration/51235/2
> [8]: https://github.com/rails/rails/issues/12330

The following assignment of format is not correct:

     /* Do nothing if portal won't return tuples */
     if (portal->tupDesc == NULL)
    +{
    +    /*
    +     * For SELECT like queries we delay filling in the tupDesc until after
    +     * PortalRunUtility, because we don't know what rows an EXECUTE
    +     * command will return. Because we delay setting tupDesc, we also need
    +     * to delay setting formats. We do this in a pretty hacky way, by
    +     * temporarily setting the portal formats to the passed in formats.
    +     * Then once we fill in tupDesc, we call PortalSetResultFormat again
    +     * with portal->formats to fill in the final formats value.
    +     */
    +    if (portal->strategy == PORTAL_UTIL_SELECT)
    +        portal->formats = formats;
         return;

because it is created in another memory context. postgres.c:

    /* Done storing stuff in portal's context */
    MemoryContextSwitchTo(oldContext);
    ...

    /* Get the result format codes */
    numRFormats = pq_getmsgint(input_message, 2);
    if (numRFormats > 0)
    {
        rformats = palloc_array(int16, numRFormats);
        for (int i = 0; i < numRFormats; i++)
            rformats[i] = pq_getmsgint(input_message, 2);
    }

It has to be copied as below:

    portal->formats = (int16 *) MemoryContextAlloc(portal->portalContext,
                                                   natts * sizeof(int16));
    memcpy(portal->formats, formats, natts * sizeof(int16));

or alternatively MemoryContextSwitchTo(oldContext) should be moved after initialization of rformat
Re: Support prepared statement invalidation when result types change
On Sat, Aug 26, 2023 at 1:58 AM Jelte Fennema wrote:
>
> The cached plan for a prepared statements can get invalidated when DDL
> changes the tables used in the query, or when search_path changes. When
> this happens the prepared statement can still be executed, but it will
> be replanned in the new context. This means that the prepared statement
> will do something different e.g. in case of search_path changes it will
> select data from a completely different table. This won't throw an
> error, because it is considered the responsibility of the operator and
> query writers that the query will still do the intended thing.
>
> However, we would throw an error if the the result of the query is of a
> different type than it was before:
> ERROR: cached plan must not change result type
>
> This requirement was not documented anywhere and it
> can thus be a surprising error to hit. But it's actually not needed for
> this to be an error, as long as we send the correct RowDescription there
> does not have to be a problem for clients when the result types or
> column counts change.
>
> This patch starts to allow a prepared statement to continue to work even
> when the result type changes.
>
> Without this change all clients that automatically prepare queries as a
> performance optimization will need to handle or avoid the error somehow,
> often resulting in deallocating and re-preparing queries when its
> usually not necessary. With this change connection poolers can also
> safely prepare the same query only once on a connection and share this
> one prepared query across clients that prepared that exact same query.
>
> Some relevant previous discussions:
> [1]: https://www.postgresql.org/message-id/flat/CAB%3DJe-GQOW7kU9Hn3AqP1vhaZg_wE9Lz6F4jSp-7cm9_M6DyVA%40mail.gmail.com
> [2]: https://stackoverflow.com/questions/2783813/postgres-error-cached-plan-must-not-change-result-type
> [3]: https://stackoverflow.com/questions/42119365/how-to-avoid-cached-plan-must-not-change-result-type-error
> [4]: https://github.com/pgjdbc/pgjdbc/pull/451
> [5]: https://github.com/pgbouncer/pgbouncer/pull/845#discussion_r1305295551
> [6]: https://github.com/jackc/pgx/issues/927
> [7]: https://elixirforum.com/t/postgrex-errors-with-cached-plan-must-not-change-result-type-during-migration/51235/2
> [8]: https://github.com/rails/rails/issues/12330

Prepared statement with no parameters: I tested many cases (add column, change column data type, rename column, set default, set not null), and it worked as expected.

With parameters, it also works, only a tiny issue with error reporting.

prepstmt2 | PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest WHERE q1 = $1; | {bigint} | {bigint,bigint,bigint}

ERROR: column "q1" does not exist at character 61
HINT: Perhaps you meant to reference the column "pcachetest.x1".
STATEMENT: execute prepstmt2(1);

I think "character 61" refers to "PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest WHERE q1 = $1;", so maybe the STATEMENT is slightly misleading.
Support prepared statement invalidation when result types change
The cached plan for a prepared statement can get invalidated when DDL changes the tables used in the query, or when search_path changes. When this happens the prepared statement can still be executed, but it will be replanned in the new context. This means that the prepared statement will do something different, e.g. in case of search_path changes it will select data from a completely different table. This won't throw an error, because it is considered the responsibility of the operator and query writers that the query will still do the intended thing.

However, we would throw an error if the result of the query is of a different type than it was before:
ERROR: cached plan must not change result type

This requirement was not documented anywhere and it can thus be a surprising error to hit. But it's actually not needed for this to be an error; as long as we send the correct RowDescription there does not have to be a problem for clients when the result types or column counts change.

This patch starts to allow a prepared statement to continue to work even when the result type changes.

Without this change all clients that automatically prepare queries as a performance optimization will need to handle or avoid the error somehow, often resulting in deallocating and re-preparing queries when it's usually not necessary. With this change connection poolers can also safely prepare the same query only once on a connection and share this one prepared query across clients that prepared that exact same query.
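For concreteness, the pre-patch behavior described above can be reproduced with a sequence like the following (table and column names are illustrative; the error text is the one quoted above):

```sql
CREATE TABLE t (a int);
PREPARE stmt AS SELECT * FROM t;
EXECUTE stmt;                              -- works, plan gets cached

ALTER TABLE t ALTER COLUMN a TYPE bigint;  -- changes the result type

EXECUTE stmt;                              -- without this patch:
-- ERROR:  cached plan must not change result type
```

With the patch, the final EXECUTE is expected to succeed and simply send a RowDescription reflecting the new bigint column.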
Some relevant previous discussions:
[1]: https://www.postgresql.org/message-id/flat/CAB%3DJe-GQOW7kU9Hn3AqP1vhaZg_wE9Lz6F4jSp-7cm9_M6DyVA%40mail.gmail.com
[2]: https://stackoverflow.com/questions/2783813/postgres-error-cached-plan-must-not-change-result-type
[3]: https://stackoverflow.com/questions/42119365/how-to-avoid-cached-plan-must-not-change-result-type-error
[4]: https://github.com/pgjdbc/pgjdbc/pull/451
[5]: https://github.com/pgbouncer/pgbouncer/pull/845#discussion_r1305295551
[6]: https://github.com/jackc/pgx/issues/927
[7]: https://elixirforum.com/t/postgrex-errors-with-cached-plan-must-not-change-result-type-during-migration/51235/2
[8]: https://github.com/rails/rails/issues/12330

From b1a58082f2b226b37d237580e33b52438d480f48 Mon Sep 17 00:00:00 2001
From: Jelte Fennema
Date: Fri, 25 Aug 2023 17:09:38 +0200
Subject: [PATCH v1 1/2] Support prepared statement invalidation when result
 types change

The cached plan for a prepared statements can get invalidated when DDL
changes the tables used in the query, or when search_path changes. When
this happens the prepared statement can still be executed, but it will
be replanned in the new context. This means that the prepared statement
will do something different e.g. in case of search_path changes it will
select data from a completely different table. This won't throw an
error, because it is considered the responsibility of the operator and
query writers that the query will still do the intended thing.

However, we would throw an error if the the result of the query is of a
different type than it was before. This was not documented anywhere and
can thus be a surprising error to hit. But it's actually not needed for
this to be an error, as long as we send the correct RowDescription there
does not have to be a problem for clients when the result types or
column counts change.

This patch starts to allow a prepared statement to continue to work even
when the result type changes.
Without this change all clients that automatically prepare queries as a
performance optimization will need to handle or avoid the error somehow,
often resulting in deallocating and re-preparing queries when its
usually not necessary. With this change connection poolers can also
safely prepare the same query only once on a connection and share this
one prepared query across clients that prepared that exact same query.
---
 src/backend/tcop/pquery.c               | 46 +++++--
 src/backend/utils/cache/plancache.c     |  5 ---
 src/test/regress/expected/plancache.out | 24 +
 src/test/regress/sql/plancache.sql      |  7 ++--
 4 files changed, 56 insertions(+), 26 deletions(-)

diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5565f200c3d..ee790009cd2 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -561,23 +561,21 @@ PortalStart(Portal portal, ParamListInfo params,

 		case PORTAL_UTIL_SELECT:

-			/*
-			 * We don't set snapshot here, because PortalRunUtility will
-			 * take care of it if needed.
-			 */
-			{
-				PlannedStmt *pstmt = PortalGetPrimaryStmt(portal);
-
-				Assert(pstmt->commandType == CMD_UTILITY);
-				portal->tupDesc = UtilityTupleDescriptor(pstmt->utilityStmt);
-			}
-
 			/*
 			 * Reset cursor position data to "start of query"
 			 */
 			portal->atStart = true;
 			portal->atEnd = false;	/* allow fetches */
 			portal->portalPos = 0;
+