Re: Use get_call_result_type() more widely
On Wed, Dec 21, 2022 at 6:44 AM Michael Paquier wrote: > > I have applied v2-0001. Thanks for taking care of this. By seeing the impact that get_call_result_type() can have for the functions that are possibly called repeatedly, I couldn't resist sharing a patch (attached herewith) that adds a note of caution and another way to build TupleDesc in the documentation to help developers out there. Thoughts? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com From 3ee0c4a402bcf2d25a7c544ddd60d3e9742b9048 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Wed, 21 Dec 2022 06:38:27 + Subject: [PATCH v1] Add another way to build TupleDesc in documentation --- doc/src/sgml/xfunc.sgml | 44 + 1 file changed, 44 insertions(+) diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml index cf5810b3c1..8b9c718fdf 100644 --- a/doc/src/sgml/xfunc.sgml +++ b/doc/src/sgml/xfunc.sgml @@ -2839,6 +2839,35 @@ TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo, cannot accept type record.) + + Another way to setup TupleDesc is to use the + following functions: + +TupleDesc CreateTemplateTupleDesc(int natts); +void TupleDescInitEntry(TupleDesc desc, +AttrNumber attributeNumber, +const char *attributeName, +Oid oidtypeid, +int32 typmod, +int attdim); +TupleDesc BlessTupleDesc(TupleDesc tupdesc); + +For example: + +/* + * Construct a tuple descriptor for the result row. The number of output + * columns, column names and types must match the function's pg_proc entry + * or function's definition. + */ +tupdesc = CreateTemplateTupleDesc(4); +TupleDescInitEntry(tupdesc, (AttrNumber) 1, "foo", INT4OID, -1, 0); +TupleDescInitEntry(tupdesc, (AttrNumber) 2, "bar", BOOLOID, -1, 0); +TupleDescInitEntry(tupdesc, (AttrNumber) 3, "baz", TEXTOID, -1, 0); +TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qux", TIMESTAMPTZOID, -1, 0); +tupdesc = BlessTupleDesc(tupdesc); + + + get_call_result_type can resolve the actual type of a @@ -2849,6 +2878,21 @@ TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo, + + +get_call_result_type is relatively expensive +as it has to look up system catalogs and do other things to setup +TupleDesc. While it is recommended +in most of the cases, remember to consider setting up +TupleDesc with +CreateTemplateTupleDesc, +TupleDescInitEntry and +BlessTupleDesc as shown above, if the function +returns set or the function is possible used repeatedly +(monitoring purposes, for instance). + + + get_call_result_type has a sibling -- 2.34.1
Re: Use get_call_result_type() more widely
On Tue, Dec 20, 2022 at 11:12:09AM -0500, Tom Lane wrote: > On the whole, I'm content to leave the BlessTupleDesc calls in > these callers. They are cheap enough if the tupdesc is already > blessed. Yeah, agreed. I have applied v2-0001, after fixing one error in wparser.c where some of the previous style was not removed, leading to unnecessary work and the same TupleDesc being built twice for the two ts_token_type()'s (input of OID or text). -- Michael signature.asc Description: PGP signature
Re: Use get_call_result_type() more widely
Bharath Rupireddy writes: > On Tue, Dec 20, 2022 at 1:41 PM Tom Lane wrote: >> Hmm ... at least one of the paths through internal_get_result_type >> is intentionally blessing the result tupdesc: >> but it's not clear if they all do, and the comments certainly >> aren't promising it. > It looks to be safe to get rid of BlessTupleDesc() after > get_call_result_type() for the functions that have OUT parameters and > return 'record' type. I think it's an absolutely horrid idea for callers to depend on such details of get_call_result_type's behavior --- especially when there is no function documentation promising it. If we want to do something here, the thing to do would be to guarantee in get_call_result_type's API spec that any returned tupledesc is blessed. However, that might make some other cases slower, if they don't need that. On the whole, I'm content to leave the BlessTupleDesc calls in these callers. They are cheap enough if the tupdesc is already blessed. regards, tom lane
Re: Use get_call_result_type() more widely
On Tue, Dec 20, 2022 at 1:41 PM Tom Lane wrote: > > Michael Paquier writes: > > On Mon, Dec 19, 2022 at 07:41:27PM +0530, Bharath Rupireddy wrote: > >> 0002 - gets rid of an unnecessary call to BlessTupleDesc() > >> after get_call_result_type(). > > > Hmm. I am not sure whether this is right, actually.. > > Hmm ... at least one of the paths through internal_get_result_type > is intentionally blessing the result tupdesc: > > if (tupdesc->tdtypeid == RECORDOID && > tupdesc->tdtypmod < 0) > assign_record_type_typmod(tupdesc); > > but it's not clear if they all do, and the comments certainly > aren't promising it. It looks to be safe to get rid of BlessTupleDesc() after get_call_result_type() for the functions that have OUT parameters and return 'record' type. This is because, the get_call_result_type()->internal_get_result_type()->build_function_result_tupdesc_t() returns non-NULL tupdesc for such functions and all the functions that 0002 patch touches are having OUT parameters and their return type is 'record'. I've also verified with Assert(tupdesc->tdtypmod >= 0); - https://github.com/BRupireddy/postgres/tree/test_for_tdypmod_init_v1. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Use get_call_result_type() more widely
Michael Paquier writes: > On Mon, Dec 19, 2022 at 07:41:27PM +0530, Bharath Rupireddy wrote: >> 0002 - gets rid of an unnecessary call to BlessTupleDesc() >> after get_call_result_type(). > Hmm. I am not sure whether this is right, actually.. Hmm ... at least one of the paths through internal_get_result_type is intentionally blessing the result tupdesc: if (tupdesc->tdtypeid == RECORDOID && tupdesc->tdtypmod < 0) assign_record_type_typmod(tupdesc); but it's not clear if they all do, and the comments certainly aren't promising it. I'd be in favor of making this a documented API promise, but it isn't that right now. regards, tom lane
Re: Use get_call_result_type() more widely
On Mon, Dec 19, 2022 at 07:41:27PM +0530, Bharath Rupireddy wrote: > I agree with the bucketization. Please see the attached patches. 0001 > - gets rid of explicit tuple desc creation using > get_call_result_type() for functions thought to be not-so-frequently > called. It looks like I am OK with the code paths updated here, which refer to none of the "critical" function paths. > 0002 - gets rid of an unnecessary call to BlessTupleDesc() > after get_call_result_type(). Hmm. I am not sure whether this is right, actually.. -- Michael signature.asc Description: PGP signature
Re: Use get_call_result_type() more widely
On Mon, Dec 19, 2022 at 05:50:03PM -0500, Robert Haas wrote: > All right, well, I just work here. :-) Just to give some numbers. The original version of the patch doing the full switch removed 500 lines of code. The second version that switches the "non-critical" paths removes 200~ lines. -- Michael signature.asc Description: PGP signature
Re: Use get_call_result_type() more widely
On Mon, Dec 19, 2022 at 4:21 PM Tom Lane wrote: > Now that somebody's made an effort to identify which places are > potentially performance-critical, I don't see why we wouldn't use > the fruits of their labor. Yes, somebody else might draw the line > differently, but drawing a line at all seems like a step forward > to me. All right, well, I just work here. :-) -- Robert Haas EDB: http://www.enterprisedb.com
Re: Use get_call_result_type() more widely
Robert Haas writes: > On Mon, Dec 19, 2022 at 2:07 PM Alvaro Herrera > wrote: >> On the other hand, the measurements have shown that going through the >> function is significantly slower. So I kinda like the judgement call >> that Michael and Bharath have made: change to use the function when >> performance is not an issue, and keep the verbose coding otherwise. > Seems fairly arbitrary to me. Agreed ... but the decisions embodied in the code-as-it-stands are even more arbitrary, being no doubt mostly based on "which function did you copy to start from" not on any thought about performance. Now that somebody's made an effort to identify which places are potentially performance-critical, I don't see why we wouldn't use the fruits of their labor. Yes, somebody else might draw the line differently, but drawing a line at all seems like a step forward to me. regards, tom lane
Re: Use get_call_result_type() more widely
On Mon, Dec 19, 2022 at 2:07 PM Alvaro Herrera wrote: > On the other hand, the measurements have shown that going through the > function is significantly slower. So I kinda like the judgement call > that Michael and Bharath have made: change to use the function when > performance is not an issue, and keep the verbose coding otherwise. Seems fairly arbitrary to me. The ones used for monitoring queries aren't likely to be run often enough that it matters, but in theory it's possible that they could be. Many of the ones supposedly not used for monitoring queries could reasonably be so used, too. You can get any answer you want by making arbitrary assumptions about which ones are likely to be used frequently and how frequently they're likely to be used, and I think different people evaluating the list independently of each other and with no knowledge of each others work would likely reach substantially different conclusions, ranging all the way from "do them all this way" to "do them all the other way" and various positions in the middle. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Use get_call_result_type() more widely
On 2022-Dec-19, Robert Haas wrote: > Here's a modest proposal: let's do nothing about this. There's no > evidence of a real problem here, so we're going to be trying to judge > the performance benefits against the code size savings without any > real data indicating that either one is an issue. I bet we could > convert all of these to one style or the other and it would make very > little real world difference, but deciding which ones to change and in > which direction will take up time and energy that could otherwise be > spent on more worthwhile projects, and could possibly complicate > back-patching, too. > > Basically, I think this is nit-picking. Let's just accept that both > styles have some advantages and leave it up to patch authors to pick > one that they prefer. The code savings are substantial actually, so I think bloating things for cases where performance is not an issue is not good. Some other developer is sure to cargo-cult that stuff in the future, and that's not great. On the other hand, the measurements have shown that going through the function is significantly slower. So I kinda like the judgement call that Michael and Bharath have made: change to use the function when performance is not an issue, and keep the verbose coding otherwise. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Re: Use get_call_result_type() more widely
On Tue, Dec 13, 2022 at 10:43 AM Tom Lane wrote: > Saving code is nice, but I'd assume the result is slower, because > get_call_result_type has to do a pretty substantial amount of work > to get the data to construct the tupdesc from. Have you tried to > quantify how much overhead this'd add? Which of these functions > can we safely consider to be non-performance-critical? Here's a modest proposal: let's do nothing about this. There's no evidence of a real problem here, so we're going to be trying to judge the performance benefits against the code size savings without any real data indicating that either one is an issue. I bet we could convert all of these to one style or the other and it would make very little real world difference, but deciding which ones to change and in which direction will take up time and energy that could otherwise be spent on more worthwhile projects, and could possibly complicate back-patching, too. Basically, I think this is nit-picking. Let's just accept that both styles have some advantages and leave it up to patch authors to pick one that they prefer. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Use get_call_result_type() more widely
On Thu, Dec 15, 2022 at 11:41 AM Michael Paquier wrote: > > On Wed, Dec 14, 2022 at 11:14:59AM +0530, Bharath Rupireddy wrote: > > AFAICS, most of these functions have no direct source code callers, > > they're user-facing functions and not in a hot code path. I measured > > the test times of these functions and I don't see much difference [1]. > > Thanks for the summary. It looks like your tests involve single > runs. What is the difference in run-time when invoking this > repeatedly with a large generate_series() for example when few or no > tuples are returned? Do you see a difference in perf profile? Some > of the functions could have their time mostly eaten while looking at > the syscache on repeated calls, but you could see the actual work this > involves with a dummy function that returns a large number of > attributes on a single record in the worst case possible? Thanks. Yes, using get_call_result_type() for a function that gets called repeatedly does have some cost as the comment around get_call_result_type() says - I found in my testing that get_call_result_type() does seem to cost 45% increase in execution times over quick iterations of a function returning a single row with 36 columns. > Separating things into two buckets.. > > > [1] > > pg_old_snapshot_time_mapping() - an extension function with no > > internal source code callers, no test coverage. > > pg_visibility_map_summary() - an extension function with no internal > > source code callers, test coverage exists, test times on HEAD:25 ms > > PATCHED:25 ms > > pg_last_committed_xact() and pg_xact_commit_timestamp_origin() - no > > internal source code callers, test coverage exists, test times on > > HEAD:10 ms PATCHED:10 ms> pg_get_multixact_members() - no internal source > > code callers, no test coverage. > > pg_control_recovery() and pg_control_init() - test coverage exists, > > test times on HEAD:42 ms PATCHED:44 ms > > pg_identify_object() - no internal source code callers, test coverage > > exists, test times on HEAD:17 ms PATCHED:18 ms > > pg_identify_object_as_address() - no internal source code callers, > > test coverage exists, test times on HEAD:66 ms PATCHED:60 ms > > pg_get_object_address() - no internal source code callers, test > > coverage exists, test times on HEAD:66 ms PATCHED:60 ms > > pg_sequence_parameters() - no internal source code callers, test > > coverage exists, test times on HEAD:96 ms PATCHED:98 ms > > ts_token_type_byid(), ts_token_type_byname(), ts_parse_byid() and > > ts_parse_byname() - internal source code callers exists, test coverage > > exists, test times on HEAD:195 ms, pg_dump 10 wallclock secs > > PATCHED:186 ms, pg_dump 10 wallclock secs > > pg_get_keywords() - no internal source code callers, test coverage > > exists, test times on HEAD:129 ms PATCHED:130 ms > > pg_get_catalog_foreign_keys() - no internal source code callers, test > > coverage exists, test times on HEAD:114 ms PATCHED:111 ms > > tsvector_unnest() - no internal source code callers, test coverage > > exists, test times on HEAD:26 ms PATCHED:26 ms > > ts_setup_firstcall() - test coverage exists, test times on HEAD:195 ms > > PATCHED:186 ms > > pg_partition_tree() - no internal source code callers, test coverage > > exists, test times on HEAD:30 ms PATCHED:32 ms > > pg_timezone_abbrevs() - no internal source code callers, test coverage > > exists, test times on HEAD:40 ms PATCHED:36 ms > > These ones don't worry me much, TBH. > > > pg_stat_get_wal(), pg_stat_get_archiver() and > > pg_stat_get_replication_slot() - no internal source code callers, test > > coverage exists, test times on HEAD:479 ms PATCHED:483 ms > > pg_prepared_xact() - no internal source code callers, test coverage > > exists, test times on HEAD:50 ms, subscription 108 wallclock secs, > > recovery 111 wallclock secs PATCHED:48 ms, subscription 110 wallclock > > secs, recovery 112 wallclock secs > > show_all_settings(), pg_control_system(), pg_control_checkpoint(), > > test_predtest() - no internal source code callers, test coverage > > exists, test times on HEAD:18 ms PATCHED:18 ms > > pg_walfile_name_offset() - no internal source code callers, no test > > coverage. > > aclexplode() - internal callers exists information_schema.sql, > > indirect test coverage exists. > > pg_stat_file() - no internal source code callers, test coverage > > exists, test times on HEAD:42 ms PATCHED:46 ms > > pg_get_publication_tables() - internal source code callers exist, test > > coverage exists, test times on HEAD:159 ms, subscription 108 wallclock > > secs PATCHED:167 ms, subscription 110 wallclock secs > > pg_lock_status() - no internal source code callers, test coverage > > exists, test times on HEAD:16 ms PATCHED:22 ms > > pg_stat_get_subscription_stats() - no internal source code callers, > > test coverage exists, test times on HEAD:subscription 108 wallclock > > secs PATCHED:subscription 110 wallclock secs > > These ones could be involved in monitoring queries run on a
Re: Use get_call_result_type() more widely
On Wed, Dec 14, 2022 at 11:14:59AM +0530, Bharath Rupireddy wrote: > AFAICS, most of these functions have no direct source code callers, > they're user-facing functions and not in a hot code path. I measured > the test times of these functions and I don't see much difference [1]. Thanks for the summary. It looks like your tests involve single runs. What is the difference in run-time when invoking this repeatedly with a large generate_series() for example when few or no tuples are returned? Do you see a difference in perf profile? Some of the functions could have their time mostly eaten while looking at the syscache on repeated calls, but you could see the actual work this involves with a dummy function that returns a large number of attributes on a single record in the worst case possible? Separating things into two buckets.. > [1] > pg_old_snapshot_time_mapping() - an extension function with no > internal source code callers, no test coverage. > pg_visibility_map_summary() - an extension function with no internal > source code callers, test coverage exists, test times on HEAD:25 ms > PATCHED:25 ms > pg_last_committed_xact() and pg_xact_commit_timestamp_origin() - no > internal source code callers, test coverage exists, test times on > HEAD:10 ms PATCHED:10 ms> pg_get_multixact_members() - no internal source > code callers, no test coverage. > pg_control_recovery() and pg_control_init() - test coverage exists, > test times on HEAD:42 ms PATCHED:44 ms > pg_identify_object() - no internal source code callers, test coverage > exists, test times on HEAD:17 ms PATCHED:18 ms > pg_identify_object_as_address() - no internal source code callers, > test coverage exists, test times on HEAD:66 ms PATCHED:60 ms > pg_get_object_address() - no internal source code callers, test > coverage exists, test times on HEAD:66 ms PATCHED:60 ms > pg_sequence_parameters() - no internal source code callers, test > coverage exists, test times on HEAD:96 ms PATCHED:98 ms > ts_token_type_byid(), ts_token_type_byname(), ts_parse_byid() and > ts_parse_byname() - internal source code callers exists, test coverage > exists, test times on HEAD:195 ms, pg_dump 10 wallclock secs > PATCHED:186 ms, pg_dump 10 wallclock secs > pg_get_keywords() - no internal source code callers, test coverage > exists, test times on HEAD:129 ms PATCHED:130 ms > pg_get_catalog_foreign_keys() - no internal source code callers, test > coverage exists, test times on HEAD:114 ms PATCHED:111 ms > tsvector_unnest() - no internal source code callers, test coverage > exists, test times on HEAD:26 ms PATCHED:26 ms > ts_setup_firstcall() - test coverage exists, test times on HEAD:195 ms > PATCHED:186 ms > pg_partition_tree() - no internal source code callers, test coverage > exists, test times on HEAD:30 ms PATCHED:32 ms > pg_timezone_abbrevs() - no internal source code callers, test coverage > exists, test times on HEAD:40 ms PATCHED:36 ms These ones don't worry me much, TBH. > pg_stat_get_wal(), pg_stat_get_archiver() and > pg_stat_get_replication_slot() - no internal source code callers, test > coverage exists, test times on HEAD:479 ms PATCHED:483 ms > pg_prepared_xact() - no internal source code callers, test coverage > exists, test times on HEAD:50 ms, subscription 108 wallclock secs, > recovery 111 wallclock secs PATCHED:48 ms, subscription 110 wallclock > secs, recovery 112 wallclock secs > show_all_settings(), pg_control_system(), pg_control_checkpoint(), > test_predtest() - no internal source code callers, test coverage > exists, test times on HEAD:18 ms PATCHED:18 ms > pg_walfile_name_offset() - no internal source code callers, no test coverage. > aclexplode() - internal callers exists information_schema.sql, > indirect test coverage exists. > pg_stat_file() - no internal source code callers, test coverage > exists, test times on HEAD:42 ms PATCHED:46 ms > pg_get_publication_tables() - internal source code callers exist, test > coverage exists, test times on HEAD:159 ms, subscription 108 wallclock > secs PATCHED:167 ms, subscription 110 wallclock secs > pg_lock_status() - no internal source code callers, test coverage > exists, test times on HEAD:16 ms PATCHED:22 ms > pg_stat_get_subscription_stats() - no internal source code callers, > test coverage exists, test times on HEAD:subscription 108 wallclock > secs PATCHED:subscription 110 wallclock secs These ones could be involved in monitoring queries run on a periodic basis. -- Michael signature.asc Description: PGP signature
Re: Use get_call_result_type() more widely
On Tue, Dec 13, 2022 at 9:12 PM Tom Lane wrote: > > Bharath Rupireddy writes: > > A review comment in another thread [1] by Michael Paquier about the > > usage of get_call_result_type() instead of explicit building of > > TupleDesc made me think about using it more widely. Actually, the > > get_call_result_type() looks at the function definitions to figure the > > column names and build the required TupleDesc, usage of which avoids > > duplication of the column names between pg_proc.dat/function > > definitions and source code. Also, it saves a good number of LOC ~415 > > [2] and the size of all the object files put together gets reduced by > > ~4MB, which means, the postgres binary becomes leaner by ~4MB [3]. > > Saving code is nice, but I'd assume the result is slower, because > get_call_result_type has to do a pretty substantial amount of work > to get the data to construct the tupdesc from. Have you tried to > quantify how much overhead this'd add? Which of these functions > can we safely consider to be non-performance-critical? AFAICS, most of these functions have no direct source code callers, they're user-facing functions and not in a hot code path. I measured the test times of these functions and I don't see much difference [1]. [1] pg_old_snapshot_time_mapping() - an extension function with no internal source code callers, no test coverage. pg_visibility_map_summary() - an extension function with no internal source code callers, test coverage exists, test times on HEAD:25 ms PATCHED:25 ms pg_last_committed_xact() and pg_xact_commit_timestamp_origin() - no internal source code callers, test coverage exists, test times on HEAD:10 ms PATCHED:10 ms pg_get_multixact_members() - no internal source code callers, no test coverage. pg_prepared_xact() - no internal source code callers, test coverage exists, test times on HEAD:50 ms, subscription 108 wallclock secs, recovery 111 wallclock secs PATCHED:48 ms, subscription 110 wallclock secs, recovery 112 wallclock secs pg_walfile_name_offset() - no internal source code callers, no test coverage. pg_get_object_address() - no internal source code callers, test coverage exists, test times on HEAD:66 ms PATCHED:60 ms pg_identify_object() - no internal source code callers, test coverage exists, test times on HEAD:17 ms PATCHED:18 ms pg_identify_object_as_address() - no internal source code callers, test coverage exists, test times on HEAD:66 ms PATCHED:60 ms pg_get_publication_tables() - internal source code callers exist, test coverage exists, test times on HEAD:159 ms, subscription 108 wallclock secs PATCHED:167 ms, subscription 110 wallclock secs pg_sequence_parameters() - no internal source code callers, test coverage exists, test times on HEAD:96 ms PATCHED:98 ms ts_token_type_byid(), ts_token_type_byname(), ts_parse_byid() and ts_parse_byname() - internal source code callers exists, test coverage exists, test times on HEAD:195 ms, pg_dump 10 wallclock secs PATCHED:186 ms, pg_dump 10 wallclock secs aclexplode() - internal callers exists information_schema.sql, indirect test coverage exists. pg_timezone_abbrevs() - no internal source code callers, test coverage exists, test times on HEAD:40 ms PATCHED:36 ms pg_stat_file() - no internal source code callers, test coverage exists, test times on HEAD:42 ms PATCHED:46 ms pg_lock_status() - no internal source code callers, test coverage exists, test times on HEAD:16 ms PATCHED:22 ms pg_get_keywords() - no internal source code callers, test coverage exists, test times on HEAD:129 ms PATCHED:130 ms pg_get_catalog_foreign_keys() - no internal source code callers, test coverage exists, test times on HEAD:114 ms PATCHED:111 ms pg_partition_tree() - no internal source code callers, test coverage exists, test times on HEAD:30 ms PATCHED:32 ms pg_stat_get_wal(), pg_stat_get_archiver() and pg_stat_get_replication_slot() - no internal source code callers, test coverage exists, test times on HEAD:479 ms PATCHED:483 ms pg_stat_get_subscription_stats() - no internal source code callers, test coverage exists, test times on HEAD:subscription 108 wallclock secs PATCHED:subscription 110 wallclock secs tsvector_unnest() - no internal source code callers, test coverage exists, test times on HEAD:26 ms PATCHED:26 ms ts_setup_firstcall() - test coverage exists, test times on HEAD:195 ms PATCHED:186 ms show_all_settings(), pg_control_system(), pg_control_checkpoint(), pg_control_recovery() and pg_control_init() - test coverage exists, test times on HEAD:42 ms PATCHED:44 ms test_predtest() - no internal source code callers, test coverage exists, test times on HEAD:18 ms PATCHED:18 ms -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Use get_call_result_type() more widely
Bharath Rupireddy writes: > A review comment in another thread [1] by Michael Paquier about the > usage of get_call_result_type() instead of explicit building of > TupleDesc made me think about using it more widely. Actually, the > get_call_result_type() looks at the function definitions to figure the > column names and build the required TupleDesc, usage of which avoids > duplication of the column names between pg_proc.dat/function > definitions and source code. Also, it saves a good number of LOC ~415 > [2] and the size of all the object files put together gets reduced by > ~4MB, which means, the postgres binary becomes leaner by ~4MB [3]. Saving code is nice, but I'd assume the result is slower, because get_call_result_type has to do a pretty substantial amount of work to get the data to construct the tupdesc from. Have you tried to quantify how much overhead this'd add? Which of these functions can we safely consider to be non-performance-critical? regards, tom lane
Re: Use get_call_result_type() more widely
On Tue, Dec 13, 2022 at 01:06:48PM +0530, Bharath Rupireddy wrote: > A review comment in another thread [1] by Michael Paquier about the > usage of get_call_result_type() instead of explicit building of > TupleDesc made me think about using it more widely. Actually, the > get_call_result_type() looks at the function definitions to figure the > column names and build the required TupleDesc, usage of which avoids > duplication of the column names between pg_proc.dat/function > definitions and source code. Also, it saves a good number of LOC ~415 > [2] and the size of all the object files put together gets reduced by > ~4MB, which means, the postgres binary becomes leaner by ~4MB [3]. I'm > attaching a patch for these changes. I have wanted to look at that when poking at the interface for materialized SRFs but lacked of steam back then. Even after this change, we still have coverage for CreateTemplateTupleDesc() and TupleDescInitEntry() through the GUCs/SHOW or even WAL sender, so the coverage does not worry me much. Backpatch conflicts may be a point of contention, but that's pretty much in the same spirit as SetSingleFuncCall()/InitMaterializedSRF(). All in that, +1 (still need to check in details what you have here, looks rather fine at quick glance). -- Michael signature.asc Description: PGP signature
Use get_call_result_type() more widely
: [PATCH v1] Use get_call_result_type() more widely Usage of get_call_result_type() in more places avoids explicit creation of tuple descriptor which requires matching column names from pg_proc.dat/function definitions. It saves a good amount of LOC ~415 and the size of all the object files put together gets reduced by ~4MB, which means, the postgres binary can become leaner by ~4MB. --- contrib/old_snapshot/time_mapping.c | 26 + contrib/pg_visibility/pg_visibility.c | 6 +- src/backend/access/transam/commit_ts.c| 26 + src/backend/access/transam/multixact.c| 9 +- src/backend/access/transam/twophase.c | 17 +-- src/backend/access/transam/xlogfuncs.c| 21 +--- src/backend/catalog/objectaddress.c | 42 +-- src/backend/catalog/pg_publication.c | 13 +-- src/backend/commands/sequence.c | 19 +-- src/backend/tsearch/wparser.c | 29 +++-- src/backend/utils/adt/acl.c | 18 +-- src/backend/utils/adt/datetime.c | 17 +-- src/backend/utils/adt/genfile.c | 20 +--- src/backend/utils/adt/lockfuncs.c | 40 +-- src/backend/utils/adt/misc.c | 31 + src/backend/utils/adt/partitionfuncs.c| 14 +-- src/backend/utils/adt/pgstatfuncs.c | 78 ++--- src/backend/utils/adt/tsvector_op.c | 15 +-- src/backend/utils/misc/guc_funcs.c| 42 +-- src/backend/utils/misc/pg_controldata.c | 108 ++ .../modules/test_predtest/test_predtest.c | 20 +--- 21 files changed, 97 insertions(+), 514 deletions(-) diff --git a/contrib/old_snapshot/time_mapping.c b/contrib/old_snapshot/time_mapping.c index 2d8cb742c3..97efccddbb 100644 --- a/contrib/old_snapshot/time_mapping.c +++ b/contrib/old_snapshot/time_mapping.c @@ -38,7 +38,6 @@ PG_MODULE_MAGIC; PG_FUNCTION_INFO_V1(pg_old_snapshot_time_mapping); static OldSnapshotTimeMapping *GetOldSnapshotTimeMapping(void); -static TupleDesc MakeOldSnapshotTimeMappingTupleDesc(void); static HeapTuple MakeOldSnapshotTimeMappingTuple(TupleDesc tupdesc, OldSnapshotTimeMapping *mapping); @@ -54,12 +53,15 @@ pg_old_snapshot_time_mapping(PG_FUNCTION_ARGS) if (SRF_IS_FIRSTCALL()) { MemoryContext oldcontext; + TupleDesc tupdesc; funcctx = SRF_FIRSTCALL_INIT(); oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); mapping = GetOldSnapshotTimeMapping(); funcctx->user_fctx = mapping; - funcctx->tuple_desc = MakeOldSnapshotTimeMappingTupleDesc(); + if (get_call_result_type(fcinfo, NULL, ) != TYPEFUNC_COMPOSITE) + elog(ERROR, "return type must be a row type"); + funcctx->tuple_desc = tupdesc; MemoryContextSwitchTo(oldcontext); } @@ -101,26 +103,6 @@ GetOldSnapshotTimeMapping(void) return mapping; } -/* - * Build a tuple descriptor for the pg_old_snapshot_time_mapping() SRF. - */ -static TupleDesc -MakeOldSnapshotTimeMappingTupleDesc(void) -{ - TupleDesc tupdesc; - - tupdesc = CreateTemplateTupleDesc(NUM_TIME_MAPPING_COLUMNS); - - TupleDescInitEntry(tupdesc, (AttrNumber) 1, "array_offset", - INT4OID, -1, 0); - TupleDescInitEntry(tupdesc, (AttrNumber) 2, "end_timestamp", - TIMESTAMPTZOID, -1, 0); - TupleDescInitEntry(tupdesc, (AttrNumber) 3, "newest_xmin", - XIDOID, -1, 0); - - return BlessTupleDesc(tupdesc); -} - /* * Convert one entry from the old snapshot time mapping to a HeapTuple. */ diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c index a95f73ec79..81f262a5f4 100644 --- a/contrib/pg_visibility/pg_visibility.c +++ b/contrib/pg_visibility/pg_visibility.c @@ -291,10 +291,8 @@ pg_visibility_map_summary(PG_FUNCTION_ARGS) ReleaseBuffer(vmbuffer); relation_close(rel, AccessShareLock); - tupdesc = CreateTemplateTupleDesc(2); - TupleDescInitEntry(tupdesc, (AttrNumber) 1, "all_visible", INT8OID, -1, 0); - TupleDescInitEntry(tupdesc, (AttrNumber) 2, "all_frozen", INT8OID, -1, 0); - tupdesc = BlessTupleDesc(tupdesc); + if (get_call_result_type(fcinfo, NULL, ) != TYPEFUNC_COMPOSITE) + elog(ERROR, "return type must be a row type"); values[0] = Int64GetDatum(all_visible); values[1] = Int64GetDatum(all_frozen); diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c index 9aa4675cb7..5c30de57ac 100644 --- a/src/backend/access/transam/commit_ts.c +++ b/src/backend/access/transam/commit_ts.c @@ -422,18 +422,8 @@ pg_last_committed_xact(PG_FUNCTION_ARGS) /* and construct a tuple with our data */ xid = GetLatestCommitTsData(, ); - /* - * Construct a tuple descriptor for the result row. This must match this - * function's pg_proc entry! - */ - tupdesc = CreateTemplateTupleDesc(3); - TupleDescInitEntry(tupdesc, (AttrNumber) 1, "xid", - XIDOID, -1, 0