Re: appendBinaryStringInfo stuff
On Fri, Feb 10, 2023 at 7:16 AM Peter Eisentraut <peter.eisentr...@enterprisedb.com> wrote:

> On 19.12.22 07:13, Peter Eisentraut wrote:
>> Also, the argument type of appendBinaryStringInfo() is char *. There is
>> some code that uses this function to assemble some kind of packed binary
>> layout, which requires a bunch of casts because of this. I think
>> functions taking binary data plus length should take void * instead,
>> like memcpy() for example.
>
> I found a little follow-up for this one: Make the same change to
> pq_sendbytes(), which is a thin wrapper around appendBinaryStringInfo().
> This would allow getting rid of further casts at call sites.

+1

Has all the benefits that 54a177a948b0a773c25c6737d1cc3cc49222a526 had. Passes make check-world.
Re: proposal: psql: psql variable BACKEND_PID
The following review has been posted through the commitfest application:

make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed

A small but helpful feature.

The new status of this patch is: Ready for Committer
Re: proposal: psql: psql variable BACKEND_PID
>> Clearly, it is hard to write a regression test for an externally volatile
>> value. `SELECT sign(:BACKEND_PID)` would technically do the job, if we're
>> striving for completeness.
>
> I did a simple test - :BACKEND_PID should be equal to pg_backend_pid()

Even better.

>> Do we want to change %p to pull from this variable and save the
>> snprintf()? Not a measurable savings, more of a don't-repeat-yourself thing.
>
> I checked the code, and I don't think so. The current state is safer, I
> think. psql's variables are not protected, so it is safer to read the
> value for the prompt directly via the libpq API instead of reading a
> possibly "customized" variable. I see a possible inconsistency, but the
> same inconsistency exists for the USER and DBNAME variables too, and I
> cannot say which is significantly better. The prompt shows the real
> value, and the related variable is more or less a copy taken at
> connection time.
>
> I am not 100% sure what is better in this area, but the change requires a
> wider and more general discussion, and I don't think the benefits of a
> possible change are enough. There are just two possible solutions: we can
> protect some of psql's variables (which can cause compatibility issues),
> or we accept that the value in the prompt can be fake. It is better not
> to touch it :-).

I agree it is out of scope of this patch, but I like the idea of protected psql variables, and I doubt users are intentionally re-using these vars to any positive effect. The more likely case is that newer psql vars just happen to use the names chosen by somebody's script in the past.

> done

Everything passes. Docs look good. Test looks good.
Re: proposal: psql: psql variable BACKEND_PID
> with doc and unsetting variable
>
> Regards
>
> Pavel

Patch applies. Manual testing confirms that it works, at least for the connected state. I don't actually know how to get psql to invoke DISCONNECT, so I killed the dev server and can confirm:

[81:14:57:01 EST] corey=# \echo :BACKEND_PID
81
[81:14:57:04 EST] corey=# select 1;
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
Time: 1.554 ms
[:15:02:31 EST] !> \echo :BACKEND_PID
:BACKEND_PID

Clearly, it is hard to write a regression test for an externally volatile value. `SELECT sign(:BACKEND_PID)` would technically do the job, if we're striving for completeness. The inability to easily DISCONNECT via psql, and the deleterious effect that would have on other regression tests, tells me that we can leave that one untested.

Notes:

This effectively makes the %p prompt (which I use in the example above) the same as %:BACKEND_PID:, and we may want to note that in the documentation.

Do we want to change %p to pull from this variable and save the snprintf()? Not a measurable savings, more of a don't-repeat-yourself thing.

In the varlistentry, I suggest we add "This variable is unset when the connection is lost." after "but can be changed or unset."
Re: proposal: psql: show current user in prompt
On Fri, Feb 3, 2023 at 9:56 AM Pavel Stehule wrote:

> Hi
>
> one visitor of p2d2 (Prague PostgreSQL Developer Day) asked if it is
> possible to show the current role in psql's prompt. I think it is not
> possible, but fortunately (with some limits) almost all necessary work is
> done, and the patch is short.
>
> In the attached patch I implemented a new prompt placeholder %N, which
> shows the current role name.
>
> (2023-02-03 15:52:28) postgres=# \set PROMPT1 '%n as %N at %/%=%# '
> pavel as pavel at postgres=# set role to admin;
> SET
> pavel as admin at postgres=> set role to default;
> SET
> pavel as pavel at postgres=#
>
> Comments, notes are welcome.
>
> Regards
>
> Pavel

This patch is cluttered with the BACKEND_PID patch and some guc_tables.c stuff that I don't think is related. We'd have to document the %N. I think there is some value here for people who have to connect as several different users (tech support) and need a reminder-at-a-glance of whether they are operating in the desired role. It may be helpful in audit documentation as well.
Re: proposal: psql: psql variable BACKEND_PID
On Fri, Feb 3, 2023 at 5:42 AM Pavel Stehule wrote:

> Hi
>
> We can simply allow access to the backend process id through a psql
> variable. I propose the name "BACKEND_PID". The advantages of usage are
> simple accessibility via the command \set, and less typing than using the
> function pg_backend_pid(), because psql variables are supported by the
> tab-completion routine. The implementation is very simple, because we can
> use the function PQbackendPID.
>
> Comments, notes?
>
> Regards
>
> Pavel

Interesting, and probably useful. It needs a corresponding line in UnsyncVariables():

SetVariable(pset.vars, "BACKEND_PID", NULL);

That will set the variable back to null when the connection goes away.
Re: Remove some useless casts to (void *)
On Thu, Feb 2, 2023 at 5:22 PM Peter Eisentraut <peter.eisentr...@enterprisedb.com> wrote:

> I have found that in some corners of the code some calls to standard C
> functions are decorated with casts to (void *) for no reason, and this
> code pattern then gets copied around. I have gone through and cleaned
> this up a bit, in the attached patches.
>
> The involved functions are: repalloc, memcpy, memset, memmove, memcmp,
> qsort, bsearch
>
> Also hash_search(), for which there was a historical reason (the
> argument used to be char *), but not anymore.

+1

All code is example code. Applies. Passes make check-world.
Re: transition tables and UPDATE
> even uglier than what I already had. So yeah, I think it might be
> useful if we had a way to inject a counter or something in there.

This came up for me when I was experimenting with making the referential integrity triggers fire on statements rather than rows. Doing so has the potential to take a lot of the sting out of big deletes where the referencing column isn't indexed (thus resulting in N sequential scans of the referencing table). If that were one statement, then we'd get a single (still ugly) hash join, which is an improvement.

It has been suggested that the overhead of forming the tuplestores of affected rows and reconstituting them into EphemeralNamedRelations could be reduced by instead storing an array of old ctids and new ctids, which obviously would be in the same order. If we had a means of reconstituting those with just the columns needed for the check (and generating a fake ordinality column for your needs), that would be considerably lighter weight than the tuplestores, and it might make statement-level triggers more useful all around.
Re: Add SHELL_EXIT_CODE to psql
> > > Unfortunately, there is a fail in FreeBSD > https://cirrus-ci.com/task/6466749487382528 > > Maybe, this patch is need to be rebased? > > That failure is in postgres_fdw, which this code doesn't touch. I'm not able to get to /tmp/cirrus-ci-build/build/testrun/postgres_fdw-running/regress/regression.out - It's not in either of the browseable zip files and the 2nd zip file isn't downloadable, so it's hard to see what's going on. I rebased, but there are no code differences. From e99c7b8aa3b81725773c03583bcff50a88a58a38 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Wed, 11 Jan 2023 17:27:23 -0500 Subject: [PATCH v8 1/3] Add PG_OS_TARGET environment variable to enable OS-specific changes to regression tests. --- src/test/regress/pg_regress.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c index 6cd5998b9d..bf0ada4560 100644 --- a/src/test/regress/pg_regress.c +++ b/src/test/regress/pg_regress.c @@ -594,6 +594,17 @@ initialize_environment(void) #endif } + /* + * Regression tests that invoke OS commands must occasionally use + * OS-specific syntax (example: NUL vs /dev/null). This env var allows + * those tests to adjust commands accordingly. + */ +#if defined(WIN32) + setenv("PG_OS_TARGET", "WIN32", 1); +#else + setenv("PG_OS_TARGET", "OTHER", 1); +#endif + /* * Set translation-related settings to English; otherwise psql will * produce translated messages and produce diffs. (XXX If we ever support -- 2.39.1 From 496a84ccf758ed34d2c9bff0e9b458bef5fa9e58 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Thu, 12 Jan 2023 23:04:31 -0500 Subject: [PATCH v8 2/3] Add wait_result_to_exit_code(). 
--- src/common/wait_error.c | 15 +++ src/include/port.h | 1 + 2 files changed, 16 insertions(+) diff --git a/src/common/wait_error.c b/src/common/wait_error.c index 4a3c3c61af..1eb7ff3fa2 100644 --- a/src/common/wait_error.c +++ b/src/common/wait_error.c @@ -127,3 +127,18 @@ wait_result_is_any_signal(int exit_status, bool include_command_not_found) return true; return false; } + +/* + * Return the POSIX exit code (0 to 255) that corresponds to the argument. + * The argument is a return code returned by wait(2) or waitpid(2), which + * also applies to pclose(3) and system(3). + */ +int +wait_result_to_exit_code(int exit_status) +{ + if (WIFEXITED(exit_status)) + return WEXITSTATUS(exit_status); + if (WIFSIGNALED(exit_status)) + return WTERMSIG(exit_status); + return 0; +} diff --git a/src/include/port.h b/src/include/port.h index e66193bed9..1b119a3c56 100644 --- a/src/include/port.h +++ b/src/include/port.h @@ -495,6 +495,7 @@ extern char *escape_single_quotes_ascii(const char *src); extern char *wait_result_to_str(int exitstatus); extern bool wait_result_is_signal(int exit_status, int signum); extern bool wait_result_is_any_signal(int exit_status, bool include_command_not_found); +extern int wait_result_to_exit_code(int exit_status); /* * Interfaces that we assume all Unix system have. We retain individual macros -- 2.39.1 From e3dc0f2117b52f9661a75e79097741ed6c0f3b20 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Mon, 23 Jan 2023 16:46:16 -0500 Subject: [PATCH v8 3/3] Add psql variables SHELL_ERROR and SHELL_EXIT_CODE which report the success or failure of the most recent psql OS-command executed via the \! command or `backticks`. These variables are roughly analogous to ERROR and SQLSTATE, but for OS level operations instead of SQL commands. 
--- doc/src/sgml/ref/psql-ref.sgml | 21 ++ src/bin/psql/command.c | 14 src/bin/psql/help.c| 4 src/bin/psql/psqlscanslash.l | 35 +- src/test/regress/expected/psql.out | 29 + src/test/regress/sql/psql.sql | 25 + 6 files changed, 122 insertions(+), 6 deletions(-) diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index dc6528dc11..3d8a7353b3 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -4264,6 +4264,27 @@ bar + + SHELL_ERROR + + + true if the last shell command failed, false if + it succeeded. This applies to shell commands invoked via either \! + or `. See also SHELL_EXIT_CODE. + + + + + + SHELL_EXIT_CODE + + + The exit code return by the last shell command, invoked via either \! or `. + 0 means no error. See also SHELL_ERROR. + + + + SHOW_ALL_RESULTS diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index b5201edf55..ea79d4fe57 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -5032,6 +5032,20 @@ do_shell(const char *command) else result = system(command); + if (result == 0) + { +
Re: Add n_tup_newpage_upd to pg_stat table views
On Fri, Jan 27, 2023 at 6:55 PM Andres Freund wrote:

> Hi,
>
> On 2023-01-27 18:23:39 -0500, Corey Huinker wrote:
>> This patch adds the n_tup_newpage_upd to all the table stat views.
>>
>> Just as we currently track HOT updates, it should be beneficial to track
>> updates where the new tuple cannot fit on the existing page and must go
>> to a different one.
>
> I like that idea.
>
> I wonder if it's quite detailed enough - we can be forced to do out-of-page
> updates because we actually are out of space, or because we reach the max
> number of line pointers we allow in a page. HOT pruning can't remove dead
> line pointers, so that doesn't necessarily help.

I must be missing something; I only see the check for running out of space, not the check for exhausting line pointers. I agree dividing them would be interesting.

> Similarly, it's a bit sad that we can't distinguish between the number of
> potential-HOT out-of-page updates and the other out-of-page updates. But
> that'd mean even more counters.

I wondered that too, but the combinations of "would have been HOT but no space" and "key update suggested not-HOT but it was id=id, so today's your lucky HOT" started to get away from me. I wondered if there was interest in knowing whether the tuple had to get TOASTed, and further wondered if we would be interested in the number of updates that had to wait on a lock. Again, more counters.
Add n_tup_newpage_upd to pg_stat table views
This patch adds the n_tup_newpage_upd to all the table stat views. Just as we currently track HOT updates, it should be beneficial to track updates where the new tuple cannot fit on the existing page and must go to a different one. Hopefully this can give users some insight as to whether their current fillfactor settings need to be adjusted. My chosen implementation replaces the hot-update boolean with an update_type which is currently a three-value enum. I favored that only slightly over adding a separate newpage-update boolean because the two events are mutually exclusive and fewer parameters is less overhead and one less assertion check. The relative wisdom of this choice may not come to light until we add a new measurement and see whether that new measurement overlaps either is-hot or is-new-page. From 55204b3d2f719f5dd8c308ea722606a40b3d09b8 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Fri, 27 Jan 2023 17:56:59 -0500 Subject: [PATCH v1] Add n_tup_newpage_upd to pg_stat_all_tables and pg_stat_xact_all_tables. This value reflects the number of tuples updated where the new tuple was placed on a different page than the previous version. 
--- doc/src/sgml/monitoring.sgml | 9 + src/backend/access/heap/heapam.c | 10 ++ src/backend/catalog/system_views.sql | 4 +++- src/backend/utils/activity/pgstat_relation.c | 8 ++-- src/backend/utils/adt/pgstatfuncs.c | 18 ++ src/include/catalog/pg_proc.dat | 9 + src/include/pgstat.h | 14 -- src/test/regress/expected/rules.out | 12 +--- 8 files changed, 72 insertions(+), 12 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 1756f1a4b6..f0291a21fb 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -4523,6 +4523,15 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i + + + n_tup_newpage_upd bigint + + + Number of rows updated where the new row is on a different page than the previous version + + + n_live_tup bigint diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index e6024a980b..f5aa429a9a 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -3155,7 +3155,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup, pagefree; bool have_tuple_lock = false; bool iscombo; - bool use_hot_update = false; + PgStat_HeapUpdateType update_type = PGSTAT_HEAPUPDATE_NON_HOT; + bool key_intact; bool all_visible_cleared = false; bool all_visible_cleared_new = false; @@ -3838,10 +3839,11 @@ l2: * changed. 
*/ if (!bms_overlap(modified_attrs, hot_attrs)) - use_hot_update = true; + update_type = PGSTAT_HEAPUPDATE_HOT; } else { + update_type = PGSTAT_HEAPUPDATE_NEW_PAGE; /* Set a hint that the old page could use prune/defrag */ PageSetFull(page); } @@ -3875,7 +3877,7 @@ l2: */ PageSetPrunable(page, xid); - if (use_hot_update) + if (update_type == PGSTAT_HEAPUPDATE_HOT) { /* Mark the old tuple as HOT-updated */ HeapTupleSetHotUpdated(); @@ -3986,7 +3988,7 @@ l2: if (have_tuple_lock) UnlockTupleTuplock(relation, &(oldtup.t_self), *lockmode); - pgstat_count_heap_update(relation, use_hot_update); + pgstat_count_heap_update(relation, update_type); /* * If heaptup is a private copy, release it. Don't forget to copy t_self diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 8608e3fa5b..292a9b88b3 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -665,6 +665,7 @@ CREATE VIEW pg_stat_all_tables AS pg_stat_get_tuples_updated(C.oid) AS n_tup_upd, pg_stat_get_tuples_deleted(C.oid) AS n_tup_del, pg_stat_get_tuples_hot_updated(C.oid) AS n_tup_hot_upd, +pg_stat_get_tuples_newpage_updated(C.oid) AS n_tup_newpage_upd, pg_stat_get_live_tuples(C.oid) AS n_live_tup, pg_stat_get_dead_tuples(C.oid) AS n_dead_tup, pg_stat_get_mod_since_analyze(C.oid) AS n_mod_since_analyze, @@ -696,7 +697,8 @@ CREATE VIEW pg_stat_xact_all_tables AS pg_stat_get_xact_tuples_inserted(C.oid) AS n_tup_ins, pg_stat_get_xact_tuples_updated(C.oid) AS n_tup_upd, pg_stat_get_xact_tuples_deleted(C.oid) AS n_tup_del, -pg_stat_get_xact_tuples_hot_updated(C.oid) AS n_tup_hot_upd +pg_stat_get_xact_tuples_hot_updated(C.oid) AS n_tup_hot_upd, +pg_stat_get_xact_tuples_newpage_updated(C.oid) AS n_tup_newpage_upd FROM pg_class C LEFT JOIN pg_index I ON C.oid = I.indrelid LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) diff --git
Re: Add SHELL_EXIT_CODE to psql
> > Thanks! But CF bot still not happy. I think, we should address issues from > here https://cirrus-ci.com/task/5391002618298368 > Sure enough, exit codes are shell dependent...adjusted the tests to reflect that. From 237b892e5efe739bc8e75d4af30140520d445491 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Wed, 11 Jan 2023 17:27:23 -0500 Subject: [PATCH v7 1/3] Add PG_OS_TARGET environment variable to enable OS-specific changes to regression tests. --- src/test/regress/pg_regress.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c index 6cd5998b9d..bf0ada4560 100644 --- a/src/test/regress/pg_regress.c +++ b/src/test/regress/pg_regress.c @@ -594,6 +594,17 @@ initialize_environment(void) #endif } + /* + * Regression tests that invoke OS commands must occasionally use + * OS-specific syntax (example: NUL vs /dev/null). This env var allows + * those tests to adjust commands accordingly. + */ +#if defined(WIN32) + setenv("PG_OS_TARGET", "WIN32", 1); +#else + setenv("PG_OS_TARGET", "OTHER", 1); +#endif + /* * Set translation-related settings to English; otherwise psql will * produce translated messages and produce diffs. (XXX If we ever support -- 2.39.0 From 01b5d159e3bae3506b74eb22ec066601454fa442 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Thu, 12 Jan 2023 23:04:31 -0500 Subject: [PATCH v7 2/3] Add wait_result_to_exit_code(). --- src/common/wait_error.c | 15 +++ src/include/port.h | 1 + 2 files changed, 16 insertions(+) diff --git a/src/common/wait_error.c b/src/common/wait_error.c index 4a3c3c61af..1eb7ff3fa2 100644 --- a/src/common/wait_error.c +++ b/src/common/wait_error.c @@ -127,3 +127,18 @@ wait_result_is_any_signal(int exit_status, bool include_command_not_found) return true; return false; } + +/* + * Return the POSIX exit code (0 to 255) that corresponds to the argument. 
+ * The argument is a return code returned by wait(2) or waitpid(2), which + * also applies to pclose(3) and system(3). + */ +int +wait_result_to_exit_code(int exit_status) +{ + if (WIFEXITED(exit_status)) + return WEXITSTATUS(exit_status); + if (WIFSIGNALED(exit_status)) + return WTERMSIG(exit_status); + return 0; +} diff --git a/src/include/port.h b/src/include/port.h index e66193bed9..1b119a3c56 100644 --- a/src/include/port.h +++ b/src/include/port.h @@ -495,6 +495,7 @@ extern char *escape_single_quotes_ascii(const char *src); extern char *wait_result_to_str(int exitstatus); extern bool wait_result_is_signal(int exit_status, int signum); extern bool wait_result_is_any_signal(int exit_status, bool include_command_not_found); +extern int wait_result_to_exit_code(int exit_status); /* * Interfaces that we assume all Unix system have. We retain individual macros -- 2.39.0 From bc936340ba99f9b7d2bbdb36a24f2fcb531d720c Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Mon, 23 Jan 2023 16:46:16 -0500 Subject: [PATCH v7 3/3] Add psql variables SHELL_ERROR and SHELL_EXIT_CODE which report the success or failure of the most recent psql OS-command executed via the \! command or `backticks`. These variables are roughly analogous to ERROR and SQLSTATE, but for OS level operations instead of SQL commands. --- doc/src/sgml/ref/psql-ref.sgml | 21 ++ src/bin/psql/command.c | 14 src/bin/psql/help.c| 4 src/bin/psql/psqlscanslash.l | 35 +- src/test/regress/expected/psql.out | 29 + src/test/regress/sql/psql.sql | 25 + 6 files changed, 122 insertions(+), 6 deletions(-) diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index dc6528dc11..3d8a7353b3 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -4264,6 +4264,27 @@ bar + + SHELL_ERROR + + + true if the last shell command failed, false if + it succeeded. This applies to shell commands invoked via either \! + or `. See also SHELL_EXIT_CODE. 
+ + + + + + SHELL_EXIT_CODE + + + The exit code return by the last shell command, invoked via either \! or `. + 0 means no error. See also SHELL_ERROR. + + + + SHOW_ALL_RESULTS diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index b5201edf55..ea79d4fe57 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -5032,6 +5032,20 @@ do_shell(const char *command) else result = system(command); + if (result == 0) + { + SetVariable(pset.vars, "SHELL_EXIT_CODE", "0"); + SetVariable(pset.vars, "SHELL_ERROR", "false"); + } + else + { + int exit_code = wait_result_to_exit_code(result); + char buf[32]; + snprintf(buf, sizeof(buf), "%d", exit_code); + SetVariable(pset.vars,
Re: Add SHELL_EXIT_CODE to psql
On Mon, Jan 23, 2023 at 2:53 PM Robert Haas wrote:

> On Mon, Jan 23, 2023 at 1:59 PM Corey Huinker wrote:
>> SHELL_ERROR is helpful in that it is a ready-made boolean that works for
>> \if tests in the same way that ERROR is set to true any time SQLSTATE is
>> nonzero. We don't yet have inline expressions for \if so the ready-made
>> boolean is a convenience.
>
> Oh, that seems a bit sad, but I guess it makes sense.

I agree, but there hasn't been much appetite for deciding what expressions would look like, or how we'd implement it. My instinct would be to not create our own expression engine, but instead integrate one that is already available. For my needs, the Unix `expr` command would be ideal (compares numbers and strings, can do regexes, can do complex expressions), but it's not cross-platform.
Re: Add SHELL_EXIT_CODE to psql
On Fri, Jan 20, 2023 at 8:54 AM Robert Haas wrote:

> On Wed, Jan 4, 2023 at 2:09 AM Corey Huinker wrote:
>> 2. There are now two psql variables, SHELL_EXIT_CODE, which has the
>> return code, and SHELL_ERROR, which is a true/false flag based on
>> whether the exit code was nonzero. These variables are the shell result
>> analogues of SQLSTATE and ERROR.
>
> Seems redundant.

SHELL_ERROR is helpful in that it is a ready-made boolean that works for \if tests in the same way that ERROR is set to true any time SQLSTATE is nonzero. We don't yet have inline expressions for \if, so the ready-made boolean is a convenience. Or are you suggesting that I should have just set ERROR itself rather than create SHELL_ERROR?
Re: Add SHELL_EXIT_CODE to psql
> > I belive, we need proper includes. > Given that wait_error.c already seems to have the right includes worked out for WEXITSTATUS/WIFSTOPPED/etc, I decided to just add a function there. I named it wait_result_to_exit_code(), but I welcome suggestions of a better name. From 9e2827a6f955e7cebf87ca538fab113a359951b4 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Wed, 11 Jan 2023 17:27:23 -0500 Subject: [PATCH v6 1/3] Add PG_OS_TARGET environment variable to enable OS-specific changes to regression tests. --- src/test/regress/pg_regress.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c index 6cd5998b9d..bf0ada4560 100644 --- a/src/test/regress/pg_regress.c +++ b/src/test/regress/pg_regress.c @@ -594,6 +594,17 @@ initialize_environment(void) #endif } + /* + * Regression tests that invoke OS commands must occasionally use + * OS-specific syntax (example: NUL vs /dev/null). This env var allows + * those tests to adjust commands accordingly. + */ +#if defined(WIN32) + setenv("PG_OS_TARGET", "WIN32", 1); +#else + setenv("PG_OS_TARGET", "OTHER", 1); +#endif + /* * Set translation-related settings to English; otherwise psql will * produce translated messages and produce diffs. (XXX If we ever support -- 2.39.0 From 2c73156874bd0c222f414a2988271c4870d4293b Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Thu, 12 Jan 2023 23:04:31 -0500 Subject: [PATCH v6 2/3] Add wait_result_to_exit_code(). --- src/common/wait_error.c | 15 +++ src/include/port.h | 1 + 2 files changed, 16 insertions(+) diff --git a/src/common/wait_error.c b/src/common/wait_error.c index 4a3c3c61af..1eb7ff3fa2 100644 --- a/src/common/wait_error.c +++ b/src/common/wait_error.c @@ -127,3 +127,18 @@ wait_result_is_any_signal(int exit_status, bool include_command_not_found) return true; return false; } + +/* + * Return the POSIX exit code (0 to 255) that corresponds to the argument. 
+ * The argument is a return code returned by wait(2) or waitpid(2), which + * also applies to pclose(3) and system(3). + */ +int +wait_result_to_exit_code(int exit_status) +{ + if (WIFEXITED(exit_status)) + return WEXITSTATUS(exit_status); + if (WIFSIGNALED(exit_status)) + return WTERMSIG(exit_status); + return 0; +} diff --git a/src/include/port.h b/src/include/port.h index e66193bed9..1b119a3c56 100644 --- a/src/include/port.h +++ b/src/include/port.h @@ -495,6 +495,7 @@ extern char *escape_single_quotes_ascii(const char *src); extern char *wait_result_to_str(int exitstatus); extern bool wait_result_is_signal(int exit_status, int signum); extern bool wait_result_is_any_signal(int exit_status, bool include_command_not_found); +extern int wait_result_to_exit_code(int exit_status); /* * Interfaces that we assume all Unix system have. We retain individual macros -- 2.39.0 From 9a969b50eec614cc181f020e25bd8dd00e7802f8 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Thu, 12 Jan 2023 23:28:14 -0500 Subject: [PATCH v6 3/3] Add psql variables SHELL_ERROR and SHELL_EXIT_CODE which report the success or failure of the most recent psql OS-command executed via the \! command or `backticks`. These variables are roughly analogous to ERROR and SQLSTATE, but for OS level operations instead of SQL commands. --- doc/src/sgml/ref/psql-ref.sgml | 21 ++ src/bin/psql/command.c | 14 src/bin/psql/help.c| 4 src/bin/psql/psqlscanslash.l | 35 +- src/test/regress/expected/psql.out | 25 + src/test/regress/sql/psql.sql | 21 ++ 6 files changed, 114 insertions(+), 6 deletions(-) diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index dc6528dc11..3d8a7353b3 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -4264,6 +4264,27 @@ bar + + SHELL_ERROR + + + true if the last shell command failed, false if + it succeeded. This applies to shell commands invoked via either \! + or `. See also SHELL_EXIT_CODE. 
+ + + + + + SHELL_EXIT_CODE + + + The exit code return by the last shell command, invoked via either \! or `. + 0 means no error. See also SHELL_ERROR. + + + + SHOW_ALL_RESULTS diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index b5201edf55..ea79d4fe57 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -5032,6 +5032,20 @@ do_shell(const char *command) else result = system(command); + if (result == 0) + { + SetVariable(pset.vars, "SHELL_EXIT_CODE", "0"); + SetVariable(pset.vars, "SHELL_ERROR", "false"); + } + else + { + int exit_code = wait_result_to_exit_code(result); + char buf[32]; + snprintf(buf, sizeof(buf),
Re: Add SHELL_EXIT_CODE to psql
> > > > The patch does not apply on top of HEAD as in [1], please post a rebased > patch: > > Conflict was due to the doc patch applying id tags to psql variable names. I've rebased and added my own id tags to the two new variables. From 9e2827a6f955e7cebf87ca538fab113a359951b4 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Wed, 11 Jan 2023 17:27:23 -0500 Subject: [PATCH v5 1/2] Add PG_OS_TARGET environment variable to enable OS-specific changes to regression tests. --- src/test/regress/pg_regress.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c index 6cd5998b9d..bf0ada4560 100644 --- a/src/test/regress/pg_regress.c +++ b/src/test/regress/pg_regress.c @@ -594,6 +594,17 @@ initialize_environment(void) #endif } + /* + * Regression tests that invoke OS commands must occasionally use + * OS-specific syntax (example: NUL vs /dev/null). This env var allows + * those tests to adjust commands accordingly. + */ +#if defined(WIN32) + setenv("PG_OS_TARGET", "WIN32", 1); +#else + setenv("PG_OS_TARGET", "OTHER", 1); +#endif + /* * Set translation-related settings to English; otherwise psql will * produce translated messages and produce diffs. (XXX If we ever support -- 2.39.0 From 4ed7ef0a21a541277250641c63a7bceea4c1ef62 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Wed, 11 Jan 2023 17:28:17 -0500 Subject: [PATCH v5 2/2] Add psql variables SHELL_ERROR and SHELL_EXIT_CODE which report the success or failure of the most recent psql OS-command executed via the \! command or `backticks`. These variables are roughly analogous to ERROR and SQLSTATE, but for OS level operations instead of SQL commands. 
--- doc/src/sgml/ref/psql-ref.sgml | 21 + src/bin/psql/command.c | 8 +++ src/bin/psql/help.c| 4 src/bin/psql/psqlscanslash.l | 38 ++ src/test/regress/expected/psql.out | 25 src/test/regress/sql/psql.sql | 21 + 6 files changed, 112 insertions(+), 5 deletions(-) diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index dc6528dc11..3d8a7353b3 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -4264,6 +4264,27 @@ bar + + SHELL_ERROR + + + true if the last shell command failed, false if + it succeeded. This applies to shell commands invoked via either \! + or `. See also SHELL_EXIT_CODE. + + + + + + SHELL_EXIT_CODE + + + The exit code return by the last shell command, invoked via either \! or `. + 0 means no error. See also SHELL_ERROR. + + + + SHOW_ALL_RESULTS diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index b5201edf55..5d15c2143b 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -5005,6 +5005,7 @@ static bool do_shell(const char *command) { int result; + char exit_code_buf[32]; fflush(NULL); if (!command) @@ -5032,6 +5033,13 @@ do_shell(const char *command) else result = system(command); + snprintf(exit_code_buf, sizeof(exit_code_buf), "%d", WEXITSTATUS(result)); + SetVariable(pset.vars, "SHELL_EXIT_CODE", exit_code_buf); + if (result == 0) + SetVariable(pset.vars, "SHELL_ERROR", "false"); + else + SetVariable(pset.vars, "SHELL_ERROR", "true"); + if (result == 127 || result == -1) { pg_log_error("\\!: failed"); diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c index e45c4aaca5..6d5226f793 100644 --- a/src/bin/psql/help.c +++ b/src/bin/psql/help.c @@ -455,6 +455,10 @@ helpVariables(unsigned short int pager) "show all results of a combined query (\\;) instead of only the last\n"); HELP0(" SHOW_CONTEXT\n" "controls display of message context fields [never, errors, always]\n"); + HELP0(" SHELL_ERROR\n" + "true if last shell command failed, else false\n"); + HELP0(" 
SHELL_EXIT_CODE\n" + "Exit code of the last shell command\n"); HELP0(" SINGLELINE\n" "if set, end of line terminates SQL commands (same as -S option)\n"); HELP0(" SINGLESTEP\n" diff --git a/src/bin/psql/psqlscanslash.l b/src/bin/psql/psqlscanslash.l index 8449ee1a52..f6f03bd816 100644 --- a/src/bin/psql/psqlscanslash.l +++ b/src/bin/psql/psqlscanslash.l @@ -23,6 +23,9 @@ #include "fe_utils/conditional.h" #include "libpq-fe.h" +#include "settings.h" +#include "variables.h" +#include <sys/wait.h> } %{ @@ -774,6 +777,8 @@ evaluate_backtick(PsqlScanState state) bool error = false; char buf[512]; size_t result; + int exit_code = 0; + char exit_code_buf[32]; initPQExpBuffer(&cmd_output); @@ -783,6 +788,7 @@ { pg_log_error("%s: %m", cmd); error = true; + exit_code = -1; } if (!error) @@ -790,7
Re: Add SHELL_EXIT_CODE to psql
On Tue, Jan 10, 2023 at 3:54 AM Maxim Orlov wrote: > > > On Mon, 9 Jan 2023 at 21:36, Corey Huinker > wrote: > >> >> I chose a name that would avoid collisions with anything a user might >> potentially throw into their environment, so if the var "OS" is fairly >> standard is a reason to avoid using it. Also, going with our own env var >> allows us to stay in perfect synchronization with the build's #ifdef WIN32 >> ... and whatever #ifdefs may come in the future for new OSes. If there is >> already an environment variable that does that for us, I would rather use >> that, but I haven't found it. >> >> Perhaps, I didn't make myself clear. Your solution is perfectly adapted > to our needs. > But all Windows since 2000 already have an environment variable > OS=Windows_NT. So, if env OS is defined and equal Windows_NT, this had to > be Windows. > May we use it in our case? I don't insist, just asking. > Ah, that makes more sense. I don't have a strong opinion on which we should use, and I would probably defer to someone who does windows (and possibly cygwin) builds more often than I do.
Re: Add SHELL_EXIT_CODE to psql
On Mon, Jan 9, 2023 at 10:01 AM Maxim Orlov wrote: > Hi! > > In overall, I think we move in the right direction. But we could make code > better, should we? > > + /* Capture exit code for SHELL_EXIT_CODE */ > + close_exit_code = pclose(fd); > + if (close_exit_code == -1) > + { > + pg_log_error("%s: %m", cmd); > + error = true; > + } > + if (WIFEXITED(close_exit_code)) > + exit_code=WEXITSTATUS(close_exit_code); > + else if(WIFSIGNALED(close_exit_code)) > + exit_code=WTERMSIG(close_exit_code); > + else if(WIFSTOPPED(close_exit_code)) > + exit_code=WSTOPSIG(close_exit_code); > + if (exit_code) > + error = true; > I think, it's better to add spaces around middle if block. It will be easy > to read. > Also, consider, adding spaces around assignment in this block. > Noted and will implement. > + /* > + snprintf(exit_code_buf, sizeof(exit_code_buf), "%d", > WEXITSTATUS(exit_code)); > + */ > Probably, this is not needed. > Heh. Oops. > > 1. pg_regress now creates an environment variable called PG_OS_TARGET > Maybe, we can use env "OS"? I do not know much about Windows, but I think > this is kind of standard environment variable there. > I chose a name that would avoid collisions with anything a user might potentially throw into their environment, so if the var "OS" is fairly standard is a reason to avoid using it. Also, going with our own env var allows us to stay in perfect synchronization with the build's #ifdef WIN32 ... and whatever #ifdefs may come in the future for new OSes. If there is already an environment variable that does that for us, I would rather use that, but I haven't found it. The 0001 patch is unchanged from last time (aside from anything rebasing might have done). > From cb2ac7393cf0ce0a7c336dfc749ff91142a88b09 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Tue, 3 Jan 2023 22:16:20 -0500 Subject: [PATCH v4 1/2] Add PG_OS_TARGET environment variable to enable OS-specific changes to regression tests. 
--- src/test/regress/pg_regress.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c index 40e6c231a3..56c59e0b10 100644 --- a/src/test/regress/pg_regress.c +++ b/src/test/regress/pg_regress.c @@ -595,6 +595,17 @@ initialize_environment(void) #endif } + /* + * Regression tests that invoke OS commands must occasionally use + * OS-specific syntax (example: NUL vs /dev/null). This env var allows + * those tests to adjust commands accordingly. + */ +#if defined(WIN32) + setenv("PG_OS_TARGET", "WIN32", 1); +#else + setenv("PG_OS_TARGET", "OTHER", 1); +#endif + /* * Set translation-related settings to English; otherwise psql will * produce translated messages and produce diffs. (XXX If we ever support -- 2.39.0 From 9d4bec9e046a80f89ce484e2dc4f25c8f1d99e9c Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Mon, 9 Jan 2023 13:25:08 -0500 Subject: [PATCH v4 2/2] Add psql variables SHELL_ERROR and SHELL_EXIT_CODE which report the success or failure of the most recent psql OS-command executed via the \! command or `backticks`. These variables are roughly analogous to ERROR and SQLSTATE, but for OS level operations instead of SQL commands. --- doc/src/sgml/ref/psql-ref.sgml | 21 + src/bin/psql/command.c | 8 +++ src/bin/psql/help.c | 4 src/bin/psql/psqlscanslash.l | 38 ++ src/test/regress/expected/psql.out | 25 src/test/regress/sql/psql.sql | 21 + 6 files changed, 112 insertions(+), 5 deletions(-) diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 3f994a3592..711a6d24fd 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -4264,6 +4264,27 @@ bar + + SHELL_ERROR + + + true if the last shell command failed, false if + it succeeded. This applies to shell commands invoked via either \! + or `. See also SHELL_EXIT_CODE. + + + + + + SHELL_EXIT_CODE + + + The exit code returned by the last shell command, invoked via either \! or `. + 0 means no error. 
See also SHELL_ERROR. + + + + SHOW_ALL_RESULTS diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index b5201edf55..5d15c2143b 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -5005,6 +5005,7 @@ static bool do_shell(const char *command) { int result; + char exit_code_buf[32]; fflush(NULL); if (!command) @@ -5032,6 +5033,13 @@ do_shell(const char *command)
Re: Generate pg_stat_get_xact*() functions with Macros
On Thu, Jan 5, 2023 at 8:50 AM Drouvot, Bertrand < bertranddrouvot...@gmail.com> wrote: > Hi hackers, > > Please find attached a patch proposal to $SUBJECT. > > This is the same kind of work that has been done in 83a1a1b566 and > 8018ffbf58 but this time for the > pg_stat_get_xact*() functions (as suggested by Andres in [1]). > > The function names remain the same, but some fields have to be renamed. > > While at it, I also took the opportunity to create the macros for > pg_stat_get_xact_function_total_time(), > pg_stat_get_xact_function_self_time() and > pg_stat_get_function_total_time(), pg_stat_get_function_self_time() > (even if the same code pattern is only repeated two 2 times). > > Now that this patch renames some fields, I think that, for consistency, > those ones should be renamed too (aka remove the f_ and t_ prefixes): > > PgStat_FunctionCounts.f_numcalls > PgStat_StatFuncEntry.f_numcalls > PgStat_TableCounts.t_truncdropped > PgStat_TableCounts.t_delta_live_tuples > PgStat_TableCounts.t_delta_dead_tuples > PgStat_TableCounts.t_changed_tuples > > But I think it would be better to do it in a follow-up patch (once this > one get committed). > > [1]: > https://www.postgresql.org/message-id/20230105002733.ealhzubjaiqis6ua%40awork3.anarazel.de > > Looking forward to your feedback, > > Regards, > > -- > Bertrand Drouvot > PostgreSQL Contributors Team > RDS Open Source Databases > Amazon Web Services: https://aws.amazon.com I like code cleanups like this. It makes sense, it results in less code, and anyone doing a `git grep pg_stat_get_live_tuples` will quickly find the macro definition. Unsurprisingly, it passes `make check-world`. So I think it's good to go as-is. 
It does get me wondering, however, if we reordered the three typedefs to group like-typed registers together, we could make them an array with the names becoming defined constant index values (or keeping them via a union), then the typedefs effectively become: typedef struct PgStat_FunctionCallUsage { PgStat_FunctionCounts *fs; instr_time time_counters[3]; } PgStat_FunctionCallUsage; typedef struct PgStat_BackendSubEntry { PgStat_Counter counters[2]; } PgStat_BackendSubEntry; typedef struct PgStat_TableCounts { bool t_truncdropped; PgStat_Counter counters[12]; } PgStat_TableCounts; Then we'd only have 3 actual C functions: pg_stat_get_xact_counter(oid, int) pg_stat_get_xact_subtrans_counter(oid, int) pg_stat_get_xact_function_time_counter(oid, int) and then the existing functions become SQL standard function body calls, something like this: CREATE OR REPLACE FUNCTION pg_stat_get_xact_numscans(oid) RETURNS bigint LANGUAGE sql STABLE PARALLEL RESTRICTED COST 1 RETURN pg_stat_get_xact_counter($1, 0); CREATE OR REPLACE FUNCTION pg_stat_get_xact_tuples_returned(oid) RETURNS bigint LANGUAGE sql STABLE PARALLEL RESTRICTED COST 1 RETURN pg_stat_get_xact_counter($1, 1); The most obvious drawback to this approach is that the C functions would need to do runtime bounds checking on the index parameter, and the amount of memory footprint saved by going from 17 short functions to 3 is not enough to make any real difference. So I think your approach is better, but I wanted to throw this idea out there.
Re: Add SHELL_EXIT_CODE to psql
On Tue, Jan 3, 2023 at 5:36 AM vignesh C wrote: > On Wed, 21 Dec 2022 at 11:04, Corey Huinker > wrote: > > > > I've rebased and updated the patch to include documentation. > > > > Regression tests have been moved to a separate patchfile because error > messages will vary by OS and configuration, so we probably can't do a > stable regression test, but having them handy at least demonstrates the > feature. > > > > On Sun, Dec 4, 2022 at 12:35 AM Corey Huinker > wrote: > >> > >> Rebased. Still waiting on feedback before working on documentation. > >> > >> On Fri, Nov 4, 2022 at 5:23 AM Corey Huinker > wrote: > >>> > >>> Oops, that sample output was from a previous run, should have been: > >>> > >>> -- SHELL_EXIT_CODE is undefined > >>> \echo :SHELL_EXIT_CODE > >>> :SHELL_EXIT_CODE > >>> -- bad \! > >>> \! borp > >>> sh: line 1: borp: command not found > >>> \echo :SHELL_EXIT_CODE > >>> 127 > >>> -- bad backtick > >>> \set var `borp` > >>> sh: line 1: borp: command not found > >>> \echo :SHELL_EXIT_CODE > >>> 127 > >>> -- good \! > >>> \! true > >>> \echo :SHELL_EXIT_CODE > >>> 0 > >>> -- play with exit codes > >>> \! exit 4 > >>> \echo :SHELL_EXIT_CODE > >>> 4 > >>> \set var `exit 3` > >>> \echo :SHELL_EXIT_CODE > >>> 3 > >>> > >>> > >>> On Fri, Nov 4, 2022 at 5:08 AM Corey Huinker > wrote: > >>>> > >>>> > >>>> Over in > https://www.postgresql.org/message-id/eaf326ad693e74eba068f33a7f518...@oss.nttdata.com > Justin Pryzby suggested that psql might need the ability to capture the > shell exit code. > >>>> > >>>> This is a POC patch that does that, but doesn't touch on the > ON_ERROR_STOP stuff. > >>>> > >>>> I've added some very rudimentary tests, but haven't touched the > documentation, because I strongly suspect that someone will suggest a > better name for the variable. > >>>> > >>>> But basically, it works like this > >>>> > >>>> -- SHELL_EXIT_CODE is undefined > >>>> \echo :SHELL_EXIT_CODE > >>>> :SHELL_EXIT_CODE > >>>> -- bad \! > >>>> \! 
borp > >>>> sh: line 1: borp: command not found > >>>> \echo :SHELL_EXIT_CODE > >>>> 32512 > >>>> -- bad backtick > >>>> \set var `borp` > >>>> sh: line 1: borp: command not found > >>>> \echo :SHELL_EXIT_CODE > >>>> 127 > >>>> -- good \! > >>>> \! true > >>>> \echo :SHELL_EXIT_CODE > >>>> 0 > >>>> -- play with exit codes > >>>> \! exit 4 > >>>> \echo :SHELL_EXIT_CODE > >>>> 1024 > >>>> \set var `exit 3` > >>>> \echo :SHELL_EXIT_CODE > >>>> 3 > >>>> > >>>> > >>>> Feedback welcome. > > CFBot shows some compilation errors as in [1], please post an updated > version for the same: > [02:35:49.924] psqlscanslash.l: In function ‘evaluate_backtick’: > [02:35:49.924] psqlscanslash.l:822:11: error: implicit declaration of > function ‘WIFSTOPPED’ [-Werror=implicit-function-declaration] > [02:35:49.924] 822 | exit_code=WSTOPSIG(exit_code); > [02:35:49.924] | ^~ > [02:35:49.924] psqlscanslash.l:823:14: error: implicit declaration of > function ‘WSTOPSIG’ [-Werror=implicit-function-declaration] > [02:35:49.924] 823 | } > [02:35:49.924] | ^ > [02:35:49.924] cc1: all warnings being treated as errors > [02:35:49.925] make[3]: *** [: psqlscanslash.o] Error 1 > > [1] - https://cirrus-ci.com/task/5424476720988160 > > Regards, > Vignesh > Thanks. I had left sys/wait.h out of psqlscanslash. Attached is v3 of this patch, I've made the following changes: 1. pg_regress now creates an environment variable called PG_OS_TARGET, which regression tests can use to manufacture os-specific commands. For our purposes, this allows the regression test to manufacture a shell command that has either "2> /dev/null" or "2> NUL". This seemed the least invasive way to make this possible. If for some reason it becomes useful in general psql scripting, then we can consider promoting it to a regular psql var. 2. There are now two psql variables, SHELL_EXIT_CODE, which has the return code, and SHE
Re: CAST(... ON DEFAULT) - WIP build on top of Error-Safe User Functions
On Mon, Jan 2, 2023 at 10:57 AM Tom Lane wrote: > Corey Huinker writes: > > The proposed changes are as follows: > > CAST(expr AS typename) > > continues to behave as before. > > CAST(expr AS typename ERROR ON ERROR) > > has the identical behavior as the unadorned CAST() above. > > CAST(expr AS typename NULL ON ERROR) > > will use error-safe functions to do the cast of expr, and will return > > NULL if the cast fails. > > CAST(expr AS typename DEFAULT expr2 ON ERROR) > > will use error-safe functions to do the cast of expr, and will return > > expr2 if the cast fails. > > While I approve of trying to get some functionality in this area, > I'm not sure that extending CAST is a great idea, because I'm afraid > that the SQL committee will do something that conflicts with it. > If we know that they are about to standardize exactly this syntax, > where is that information available? If we don't know that, > I'd prefer to invent some kind of function or other instead of > extending the grammar. > > regards, tom lane > I'm going off the spec that Vik presented in https://www.postgresql.org/message-id/f8600a3b-f697-2577-8fea-f40d3e18b...@postgresfriends.org which is his effort to get it through the SQL committee. I was already thinking about how to get the SQLServer TRY_CAST() function into postgres, so this seemed like the logical next step. While the syntax may change, the underlying infrastructure would remain basically the same: we would need the ability to detect that a typecast had failed, and replace it with the default value, and handle that at parse time, or executor time, and handle array casts where the array has the default but the underlying elements can't. It would be simple to move the grammar changes to their own patch if that removes a barrier for people.
Re: Add SHELL_EXIT_CODE to psql
On Sat, Dec 31, 2022 at 5:28 PM Isaac Morland wrote: > On Sat, 31 Dec 2022 at 16:47, Corey Huinker > wrote: > >> >>> I wonder if there is value in setting up a psql on/off var >>> SHELL_ERROR_OUTPUT construct that when set to "off/false" >>> suppresses standard error via appending "2> /dev/null" (or "2> nul" if >>> #ifdef WIN32). At the very least, it would allow for tests like this to be >>> done with standard regression scripts. >>> >> >> Thinking on this some more a few ideas came up: >> >> 1. The SHELL_ERROR_OUTPUT var above. This is good for simple situations, >> but it would fail if the user took it upon themselves to redirect output, >> and suddenly "myprog 2> /dev/null" works fine on its own but will fail when >> we append our own "2> /dev/null" to it. >> > > Rather than attempting to append redirection directives to the command, > would it work to redirect stderr before invoking the shell? This seems to > me to be more reliable and it should allow an explicit redirection in the > shell command to still work. The difference between Windows and Unix then > becomes the details of what system calls we use to accomplish the > redirection (or maybe none, if an existing abstraction layer takes care of > that - I'm not really up on Postgres internals enough to know), rather than > what we append to the provided command. > > Inside psql, it's a call to the system() function which takes a single string. All output, stdout or stderr, is printed. So for the regression test we have to compose a command that is OS appropriate AND suppresses stderr.
Re: Add SHELL_EXIT_CODE to psql
On Fri, Dec 30, 2022 at 2:17 PM Corey Huinker wrote: > On Wed, Dec 28, 2022 at 5:59 AM Maxim Orlov wrote: > >> Hi! >> >> The patch is implementing what is declared to do. Shell return code is >> now accessible is psql var. >> Overall code is in a good condition. Applies with no errors on master. >> Unfortunately, regression tests are failing on the macOS due to the >> different shell output. >> > > That was to be expected. > > I wonder if there is value in setting up a psql on/off var > SHELL_ERROR_OUTPUT construct that when set to "off/false" > suppresses standard error via appending "2> /dev/null" (or "2> nul" if > #ifdef WIN32). At the very least, it would allow for tests like this to be > done with standard regression scripts. > Thinking on this some more a few ideas came up: 1. The SHELL_ERROR_OUTPUT var above. This is good for simple situations, but it would fail if the user took it upon themselves to redirect output, and suddenly "myprog 2> /dev/null" works fine on its own but will fail when we append our own "2> /dev/null" to it. 2. Exposing a PSQL var that identifies the shell snippet for "how to suppress standard error", which would be "2> /dev/null" for -nix systems and "2> NUL" for windows systems. This again, presumes that users will actually need this feature, and I'm not convinced that they will. 3. Exposing a PSQL var that is basically an "is this environment running on windows" and let them construct it from there. That gets complicated with WSL and the like, so I'm not wild about this, even though it would be comparatively simple to implement. 4. We could inject an environment variable into our own regression tests directly based on the "#ifdef WIN32" in our own code, and we can use a couple of \gset commands to construct the error-suppressed shell commands as needed. This seems like the best starting point, and we can always turn that env var into a real variable if it proves useful elsewhere.
Re: Add SHELL_EXIT_CODE to psql
On Wed, Dec 28, 2022 at 5:59 AM Maxim Orlov wrote: > Hi! > > The patch is implementing what is declared to do. Shell return code is now > accessible is psql var. > Overall code is in a good condition. Applies with no errors on master. > Unfortunately, regression tests are failing on the macOS due to the > different shell output. > That was to be expected. I wonder if there is value in setting up a psql on/off var SHELL_ERROR_OUTPUT construct that when set to "off/false" suppresses standard error via appending "2> /dev/null" (or "2> nul" if #ifdef WIN32). At the very least, it would allow for tests like this to be done with standard regression scripts. >
Re: Add SHELL_EXIT_CODE to psql
I've rebased and updated the patch to include documentation. Regression tests have been moved to a separate patchfile because error messages will vary by OS and configuration, so we probably can't do a stable regression test, but having them handy at least demonstrates the feature. On Sun, Dec 4, 2022 at 12:35 AM Corey Huinker wrote: > Rebased. Still waiting on feedback before working on documentation. > > On Fri, Nov 4, 2022 at 5:23 AM Corey Huinker > wrote: > >> Oops, that sample output was from a previous run, should have been: >> >> -- SHELL_EXIT_CODE is undefined >> \echo :SHELL_EXIT_CODE >> :SHELL_EXIT_CODE >> -- bad \! >> \! borp >> sh: line 1: borp: command not found >> \echo :SHELL_EXIT_CODE >> 127 >> -- bad backtick >> \set var `borp` >> sh: line 1: borp: command not found >> \echo :SHELL_EXIT_CODE >> 127 >> -- good \! >> \! true >> \echo :SHELL_EXIT_CODE >> 0 >> -- play with exit codes >> \! exit 4 >> \echo :SHELL_EXIT_CODE >> 4 >> \set var `exit 3` >> \echo :SHELL_EXIT_CODE >> 3 >> >> >> On Fri, Nov 4, 2022 at 5:08 AM Corey Huinker >> wrote: >> >>> >>> Over in >>> https://www.postgresql.org/message-id/eaf326ad693e74eba068f33a7f518...@oss.nttdata.com >>> Justin >>> Pryzby suggested that psql might need the ability to capture the shell exit >>> code. >>> >>> This is a POC patch that does that, but doesn't touch on the >>> ON_ERROR_STOP stuff. >>> >>> I've added some very rudimentary tests, but haven't touched the >>> documentation, because I strongly suspect that someone will suggest a >>> better name for the variable. >>> >>> But basically, it works like this >>> >>> -- SHELL_EXIT_CODE is undefined >>> \echo :SHELL_EXIT_CODE >>> :SHELL_EXIT_CODE >>> -- bad \! >>> \! borp >>> sh: line 1: borp: command not found >>> \echo :SHELL_EXIT_CODE >>> 32512 >>> -- bad backtick >>> \set var `borp` >>> sh: line 1: borp: command not found >>> \echo :SHELL_EXIT_CODE >>> 127 >>> -- good \! >>> \! true >>> \echo :SHELL_EXIT_CODE >>> 0 >>> -- play with exit codes >>> \! 
exit 4 >>> \echo :SHELL_EXIT_CODE >>> 1024 >>> \set var `exit 3` >>> \echo :SHELL_EXIT_CODE >>> 3 >>> >>> >>> Feedback welcome. >>> >> From 443be6f64ca89bbd6367a011f2a98aa9324f27a9 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Wed, 21 Dec 2022 00:24:47 -0500 Subject: [PATCH 1/2] Make the exit code of shell commands executed via psql visible via the variable SHELL_EXIT_CODE. --- doc/src/sgml/ref/psql-ref.sgml | 9 + src/bin/psql/command.c | 4 src/bin/psql/help.c | 2 ++ src/bin/psql/psqlscanslash.l | 24 +--- 4 files changed, 36 insertions(+), 3 deletions(-) diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 8a5285da9a..d0c80b4528 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -4260,6 +4260,15 @@ bar + + SHELL_EXIT_CODE + + + The exit code returned by the last shell command. 0 means no error. + + + + SHOW_ALL_RESULTS diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index de6a3a71f8..f6d6a489a9 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -4998,6 +4998,7 @@ static bool do_shell(const char *command) { int result; + char exit_code_buf[32]; fflush(NULL); if (!command) @@ -5025,6 +5026,9 @@ do_shell(const char *command) else result = system(command); + snprintf(exit_code_buf, sizeof(exit_code_buf), "%d", WEXITSTATUS(result)); + SetVariable(pset.vars, "SHELL_EXIT_CODE", exit_code_buf); + if (result == 127 || result == -1) { pg_log_error("\\!: failed"); diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c index b4e0ec2687..caf13e2ed2 100644 --- a/src/bin/psql/help.c +++ b/src/bin/psql/help.c @@ -455,6 +455,8 @@ helpVariables(unsigned short int pager) "show all results of a combined query (\\;) instead of only the last\n"); HELP0(" SHOW_CONTEXT\n" "controls display of message context fields [never, errors, always]\n"); + HELP0(" SHELL_EXIT_CODE\n" + "Exit code of the last shell command\n"); HELP0(" SINGLELINE\n" "if set, end of line terminates SQL commands (same as -S 
option)\n"); HELP0(" SINGLESTEP\n" diff -
Re: Common function for percent placeholder replacement
> > How about this new one with variable arguments? I like this a lot, but I also see merit in Alvaro's PERCENT_OPT variadic, which at least avoids the two lists getting out of sync. Initially, I was going to ask that we have shell-quote-safe equivalents of whatever fixed parameters we baked in, but this allows the caller to do that as needed. It seems we could now just copy quote_identifier and strip out the keyword checks to get the desired effect. Has anyone else had a need for quote-safe args in the shell commands?
CAST(... ON DEFAULT) - WIP build on top of Error-Safe User Functions
Attached is my work in progress to implement the changes to the CAST() function as proposed by Vik Fearing. This work builds upon the Error-safe User Functions work currently ongoing. The proposed changes are as follows: CAST(expr AS typename) continues to behave as before. CAST(expr AS typename ERROR ON ERROR) has the identical behavior as the unadorned CAST() above. CAST(expr AS typename NULL ON ERROR) will use error-safe functions to do the cast of expr, and will return NULL if the cast fails. CAST(expr AS typename DEFAULT expr2 ON ERROR) will use error-safe functions to do the cast of expr, and will return expr2 if the cast fails. There is an additional FORMAT parameter that I have not yet implemented, my understanding is that it is largely intended for DATE/TIME field conversions, but others are certainly possible. CAST(expr AS typename FORMAT fmt DEFAULT expr2 ON ERROR) What is currently working: - Any scalar expression that can be evaluated at parse time. These tests from cast.sql all currently work: VALUES (CAST('error' AS integer)); VALUES (CAST('error' AS integer ERROR ON ERROR)); VALUES (CAST('error' AS integer NULL ON ERROR)); VALUES (CAST('error' AS integer DEFAULT 42 ON ERROR)); SELECT CAST('{123,abc,456}' AS integer[] DEFAULT '{-789}' ON ERROR) as array_test1; - Scalar values evaluated at runtime. CREATE TEMPORARY TABLE t(t text); INSERT INTO t VALUES ('a'), ('1'), ('b'), (2); SELECT CAST(t.t AS integer DEFAULT -1 ON ERROR) AS foo FROM t; foo - -1 1 -1 2 (4 rows) Along the way, I made a few design decisions, each of which is up for debate: First, I created OidInputFunctionCallSafe, which is to OidInputFunctionCall what InputFunctionCallSafe is to InputFunctionCall. Given that the only place I ended up using it was stringTypeDatumSafe(), it may be possible to just move that code inside stringTypeDatumSafe. 
Next, I had a need for FuncExpr, CoerceViaIO, and ArrayCoerce to all report if their expr argument failed, and if not, skip past the evaluation of expr2. Rather than duplicate this logic in several places, I chose instead to modify CoalesceExpr to allow for an error-test mode in addition to its default null-test mode, and then to provide this altered node with two expressions, the first being the error-safe typecast of expr and the second being the non-error-safe typecast of expr2. I still don't have array-to-array casts working, as the changes I would likely need to make to ArrayCoerce get somewhat invasive, so this seemed like a good time to post my work so far and solicit some feedback beyond what I've already been getting from Jeff Davis and Michael Paquier. I've sidestepped domains for the time being, as well as avoiding JIT issues entirely. No documentation is currently prepared. All but one of the regression test queries work, the one that is currently failing is: SELECT CAST('{234,def,567}'::text[] AS integer[] DEFAULT '{-1011}' ON ERROR) as array_test2; Other quirks: - an unaliased CAST ON DEFAULT will return the column name of "coalesce", which internally is true, but obviously would be quite confusing to a user. As a side observation, I noticed that the optimizer already tries to resolve expressions based on constants and to collapse expression trees where possible, which makes me wonder if the work done to do the same in transformTypeCast() and coerce_to_target_type and coerce_type isn't also wasted. 
From 897c9c68a29ad0fa57f28734df0c77553e026d80 Mon Sep 17 00:00:00 2001 From: coreyhuinker Date: Sun, 18 Dec 2022 16:20:01 -0500 Subject: [PATCH 1/3] add OidInputFunctionCall --- src/backend/utils/fmgr/fmgr.c | 13 + src/include/fmgr.h | 4 2 files changed, 17 insertions(+) diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c index 0d37f69298..e9a19ce653 100644 --- a/src/backend/utils/fmgr/fmgr.c +++ b/src/backend/utils/fmgr/fmgr.c @@ -1701,6 +1701,19 @@ OidInputFunctionCall(Oid functionId, char *str, Oid typioparam, int32 typmod) return InputFunctionCall(&flinfo, str, typioparam, typmod); } +bool +OidInputFunctionCallSafe(Oid functionId, char *str, Oid typioparam, + int32 typmod, fmNodePtr escontext, + Datum *result) +{ + FmgrInfo flinfo; + + fmgr_info(functionId, &flinfo); + + return InputFunctionCallSafe(&flinfo, str, typioparam, typmod, + escontext, result); +} + char * OidOutputFunctionCall(Oid functionId, Datum val) { diff --git a/src/include/fmgr.h b/src/include/fmgr.h index b7832d0aa2..b835ef72b5 100644 --- a/src/include/fmgr.h +++ b/src/include/fmgr.h @@ -706,6 +706,10 @@ extern bool InputFunctionCallSafe(FmgrInfo *flinfo, char *str, Datum *result); extern Datum OidInputFunctionCall(Oid functionId, char *str, Oid typioparam, int32 typmod); +extern bool +OidInputFunctionCallSafe(Oid functionId, char *str, Oid typioparam, + int32 typmod, fmNodePtr escontext, + Datum *result); extern char *OutputFunctionCall(FmgrInfo
Re: Error-safe user functions
On Sat, Dec 10, 2022 at 9:20 AM Tom Lane wrote: > Alvaro Herrera writes: > > On 2022-Dec-09, Tom Lane wrote: > >> ... So I think it might be > >> okay to say "if you want soft error treatment for a domain, > >> make sure its check constraints don't throw errors". > > > I think that's fine. If the user does, say "CHECK (value > 0)" and that > > results in a soft error, that seems to me enough support for now. If > > they want to do something more elaborate, they can write C functions. > > Maybe eventually we'll want to offer some other mechanism that doesn't > > require C, but let's figure out what the requirements are. I don't > > think we know that, at this point. > > A fallback we can offer to anyone with such a problem is "write a > plpgsql function and wrap the potentially-failing bit in an exception > block". Then they get to pay the cost of the subtransaction, while > we're not imposing one on everybody else. > > regards, tom lane > That exception block will prevent parallel plans. I'm not saying it isn't the best way forward for us, but wanted to make that side effect clear.
Re: Error-safe user functions
On Fri, Dec 9, 2022 at 11:17 AM Amul Sul wrote: > On Fri, Dec 9, 2022 at 9:08 PM Andrew Dunstan wrote: > > > > > > On 2022-12-09 Fr 10:16, Tom Lane wrote: > > > Andrew Dunstan writes: > > >> On 2022-12-08 Th 21:59, Tom Lane wrote: > > >>> Yeah, I was planning to take a look at that before walking away from > > >>> this stuff. (I'm sure not volunteering to convert ALL the input > > >>> functions, but I'll do the datetime code.) > > >> Awesome. Perhaps if there are no more comments you can commit what you > > >> currently have so people can start work on other input functions. > > > Pushed. > > > > > > Great! > > > > > > > As I said, I'll take a look at the datetime area. Do we > > > have any volunteers for other input functions? > > > > > > > > > > > > I am currently looking at the json types. I think that will be enough to > > let us rework the sql/json patches as discussed a couple of months ago. > > > > I will pick a few other input functions, thanks. > > Regards, > Amul > I can do a few as well, as I need them done for the CAST With Default effort. Amul, please let me know which ones you pick so we don't duplicate work.
Re: Error-safe user functions
On Wed, Dec 7, 2022 at 12:17 PM Tom Lane wrote: > Corey Huinker writes: > > In my attempt to implement CAST...DEFAULT, I noticed that I immediately > > needed an > > OidInputFunctionCallSafe, which was trivial but maybe something we want > to > > add to the infra patch, but the comments around that function also > somewhat > > indicate that we might want to just do the work in-place and call > > InputFunctionCallSafe directly. Open to both ideas. > > I'm a bit skeptical of that. IMO using OidInputFunctionCall is only > appropriate in places that will be executed just once per query. > That is what's happening when the expr of the existing CAST ( expr AS typename ) is a constant and we want to just resolve the constant at parse time. > >
Re: Error-safe user functions
On Wed, Dec 7, 2022 at 9:20 AM Tom Lane wrote: > Andrew Dunstan writes: > > Perhaps we should add a type in the regress library that will never have > > a safe input function, so we can test that the mechanism works as > > expected in that case even after we adjust all the core data types' > > input functions. > > I was intending that the existing "widget" type be that. 0003 already > adds a comment to widget_in saying not to "fix" its one ereport call. > > Returning to the naming quagmire -- it occurred to me just now that > it might be helpful to call this style of error reporting "soft" > errors rather than "safe" errors, which'd provide a nice contrast > with "hard" errors thrown by longjmp'ing. That would lead to naming > all the variant functions XXXSoft not XXXSafe. There would still > be commentary to the effect that "soft errors must be safe, in the > sense that there's no question whether it's safe to continue > processing the transaction". Anybody think that'd be an > improvement? In my attempt to implement CAST...DEFAULT, I noticed that I immediately needed an OidInputFunctionCallSafe, which was trivial but maybe something we want to add to the infra patch, but the comments around that function also somewhat indicate that we might want to just do the work in-place and call InputFunctionCallSafe directly. Open to both ideas. Looking forward cascades up into coerce_type and its brethren, and reimplementing those from a Node returner to a boolean returner with a Node parameter seems a bit of a stretch, so I have to pick a point where the code pivots from passing down a safe-mode indicator and passing back a found_error indicator (which may be combine-able, as safe is always true when the found_error pointer is not null, and always false when it isn't), but for the most part things look do-able.
Re: Error-safe user functions
On Tue, Dec 6, 2022 at 6:46 AM Andrew Dunstan wrote: > > On 2022-12-05 Mo 20:06, Tom Lane wrote: > > Andres Freund writes: > > > >> But perhaps it's even worth having such a function properly exposed: > >> It's not at all rare to parse text data during ETL and quite often > >> erroring out fatally is undesirable. As savepoints are undesirable > >> overhead-wise, there's a lot of SQL out there that tries to do a > >> pre-check about whether some text could be cast to some other data > >> type. A function that'd try to cast input to a certain type without > >> erroring out hard would be quite useful for that. > > Corey and Vik are already talking about a non-error CAST variant. > > > /metoo! :-) > > > > Maybe we should leave this in abeyance until something shows up > > for that? Otherwise we'll be making a nonstandard API for what > > will probably ultimately be SQL-spec functionality. I don't mind > > that as regression-test infrastructure, but I'm a bit less excited > > about exposing it as a user feature. > > > > > I think a functional mechanism could be very useful. Who knows when the > standard might specify something in this area? > > > Vik's working on the standard (he put the spec in earlier in this thread). I'm working on implementing it on top of Tom's work, but I'm one patchset behind at the moment. Once completed, it should be leverage-able in several places, COPY being the most obvious. What started all this was me noticing that if I implemented TRY_CAST in pl/pgsql with an exception block, then I wasn't able to use parallel query.
Re: ANY_VALUE aggregate
On Mon, Dec 5, 2022 at 12:57 PM David G. Johnston < david.g.johns...@gmail.com> wrote: > On Mon, Dec 5, 2022 at 7:57 AM Vik Fearing > wrote: > >> The SQL:2023 Standard defines a new aggregate named ANY_VALUE. It >> returns an implementation-dependent (i.e. non-deterministic) value from >> the rows in its group. >> >> PFA an implementation of this aggregate. >> >> > Can we please add "first_value" and "last_value" if we are going to add > "some_random_value" to our library of aggregates? > > Also, maybe we should have any_value do something like compute a 50/50 > chance that any new value seen replaces the existing chosen value, instead > of simply returning the first value all the time. Maybe even prohibit the > first value from being chosen so long as a second value appears. > > David J. > Adding to the pile of wanted aggregates: in the past I've lobbied for only_value() which is like first_value() but it raises an error on encountering a second value.
Re: Error-safe user functions
On Mon, Dec 5, 2022 at 1:00 PM Tom Lane wrote: > Andrew Dunstan writes: > > Wait a minute! Oh, no, sorry, as you were, 'errsave' is fine. > > Seems like everybody's okay with errsave. I'll make a v2 in a > little bit. I'd like to try updating array_in and/or record_in > just to verify that indirection cases work okay, before we consider > the design to be set. > +1 to errsave.
Re: Error-safe user functions
On Mon, Dec 5, 2022 at 11:36 AM Andrew Dunstan wrote: > > On 2022-12-05 Mo 11:20, Robert Haas wrote: > > On Mon, Dec 5, 2022 at 11:09 AM Tom Lane wrote: > >> Robert Haas writes: > >>> On Sat, Dec 3, 2022 at 10:57 PM Corey Huinker > wrote: > >>>> 2. ereturn_* => errfeedback / error_feedback / feedback > >>> Oh, I like that, especially errfeedback. > >> efeedback? But TBH I do not think any of these are better than ereturn. > > I do. Having a macro name that is "return" plus one character is going > > to make people think that it returns. I predict that if you insist on > > using that name people are still going to be making mistakes based on > > that confusion 10 years from now. > > > > OK, I take both this point and Tom's about trying to keep it the same > length. So we need something that's 7 letters, doesn't say 'return' and > preferably begins with 'e'. I modestly suggest 'eseterr', or if we like > the 'feedback' idea 'efeedbk'. > > > Consulting https://www.thesaurus.com/browse/feedback again: ereply clocks in at 7 characters.
Re: Add SHELL_EXIT_CODE to psql
Rebased. Still waiting on feedback before working on documentation. On Fri, Nov 4, 2022 at 5:23 AM Corey Huinker wrote: > Oops, that sample output was from a previous run, should have been: > > -- SHELL_EXIT_CODE is undefined > \echo :SHELL_EXIT_CODE > :SHELL_EXIT_CODE > -- bad \! > \! borp > sh: line 1: borp: command not found > \echo :SHELL_EXIT_CODE > 127 > -- bad backtick > \set var `borp` > sh: line 1: borp: command not found > \echo :SHELL_EXIT_CODE > 127 > -- good \! > \! true > \echo :SHELL_EXIT_CODE > 0 > -- play with exit codes > \! exit 4 > \echo :SHELL_EXIT_CODE > 4 > \set var `exit 3` > \echo :SHELL_EXIT_CODE > 3 > > > On Fri, Nov 4, 2022 at 5:08 AM Corey Huinker > wrote: > >> >> Over in >> https://www.postgresql.org/message-id/eaf326ad693e74eba068f33a7f518...@oss.nttdata.com >> Justin >> Pryzby suggested that psql might need the ability to capture the shell exit >> code. >> >> This is a POC patch that does that, but doesn't touch on the >> ON_ERROR_STOP stuff. >> >> I've added some very rudimentary tests, but haven't touched the >> documentation, because I strongly suspect that someone will suggest a >> better name for the variable. >> >> But basically, it works like this >> >> -- SHELL_EXIT_CODE is undefined >> \echo :SHELL_EXIT_CODE >> :SHELL_EXIT_CODE >> -- bad \! >> \! borp >> sh: line 1: borp: command not found >> \echo :SHELL_EXIT_CODE >> 32512 >> -- bad backtick >> \set var `borp` >> sh: line 1: borp: command not found >> \echo :SHELL_EXIT_CODE >> 127 >> -- good \! >> \! true >> \echo :SHELL_EXIT_CODE >> 0 >> -- play with exit codes >> \! exit 4 >> \echo :SHELL_EXIT_CODE >> 1024 >> \set var `exit 3` >> \echo :SHELL_EXIT_CODE >> 3 >> >> >> Feedback welcome. 
From fef0e52706c9136659f0f8846bae289f0dbfb469 Mon Sep 17 00:00:00 2001
From: coreyhuinker
Date: Fri, 4 Nov 2022 04:45:39 -0400
Subject: [PATCH] POC: expose shell exit code as a psql variable

---
 src/bin/psql/command.c             |  4 
 src/bin/psql/help.c                |  2 ++
 src/bin/psql/psqlscanslash.l       | 28 +---
 src/test/regress/expected/psql.out | 11 +++
 src/test/regress/sql/psql.sql      |  7 +++
 5 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index de6a3a71f8..f6d6a489a9 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -4998,6 +4998,7 @@ static bool
 do_shell(const char *command)
 {
 	int			result;
+	char		exit_code_buf[32];
 
 	fflush(NULL);
 	if (!command)
@@ -5025,6 +5026,9 @@ do_shell(const char *command)
 	else
 		result = system(command);
 
+	snprintf(exit_code_buf, sizeof(exit_code_buf), "%d", WEXITSTATUS(result));
+	SetVariable(pset.vars, "SHELL_EXIT_CODE", exit_code_buf);
+
 	if (result == 127 || result == -1)
 	{
 		pg_log_error("\\!: failed");
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index b4e0ec2687..caf13e2ed2 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -455,6 +455,8 @@ helpVariables(unsigned short int pager)
 		  "show all results of a combined query (\\;) instead of only the last\n");
 	HELP0("  SHOW_CONTEXT\n"
 		  "controls display of message context fields [never, errors, always]\n");
+	HELP0("  SHELL_EXIT_CODE\n"
+		  "Exit code of the last shell command\n");
 	HELP0("  SINGLELINE\n"
 		  "if set, end of line terminates SQL commands (same as -S option)\n");
 	HELP0("  SINGLESTEP\n"
diff --git a/src/bin/psql/psqlscanslash.l b/src/bin/psql/psqlscanslash.l
index a467b72144..30e6f5dcd4 100644
--- a/src/bin/psql/psqlscanslash.l
+++ b/src/bin/psql/psqlscanslash.l
@@ -27,6 +27,8 @@
 %{
 #include "fe_utils/psqlscan_int.h"
+#include "settings.h"
+#include "variables.h"
 
 /*
  * We must have a typedef YYSTYPE for yylex's first argument, but this lexer
@@ -774,6 +776,8 @@ evaluate_backtick(PsqlScanState state)
 	bool		error = false;
 	char		buf[512];
 	size_t		result;
+	int			exit_code = 0;
+	char		exit_code_buf[32];
 
 	initPQExpBuffer(_output);
 
@@ -783,6 +787,7 @@
 	{
 		pg_log_error("%s: %m", cmd);
 		error = true;
+		exit_code = -1;
 	}
 
 	if (!error)
@@ -800,10 +805,25 @@
 		} while (!feof(fd));
 	}
 
-	if (fd && pclose(fd) == -1)
+	if (fd)
 	{
-		pg_log_error("%s: %m", cmd);
-		error = true;
+		exit_code = pclose(fd);
+		if (exit_code == -1)
+		{
+			pg_log_error("%s: %m", cmd);
+			error = true;
+		}
+		if
Re: Make ON_ERROR_STOP stop on shell script failure
On Tue, Nov 22, 2022 at 6:16 PM Matheus Alcantara wrote: > --- Original Message --- > On Tuesday, November 22nd, 2022 at 20:10, bt22nakamorit < > bt22nakamo...@oss.nttdata.com> wrote: > > > > There was a mistake in the error message for \! so I updated the patch. > > > > Best, > > Tatsuhiro Nakamori > > Hi > > I was checking your patch and seems that it failed to be applied into the > master cleanly. Could you please rebase it? > Yes. My apologies, I had several life events get in the way.
Re: Error-safe user functions
> > I think there are just a couple of loose ends here: > > 1. Bikeshedding on my name choices is welcome. I know Robert is > dissatisfied with "ereturn", but I'm content with that so I didn't > change it here. > 1. details_please => include_error_data, as this hints the reader directly to the struct to be filled out 2. ereturn_* => errfeedback / error_feedback / efeedback It is returned, but it's not taking control and the caller could ignore it. I arrived at this after checking https://www.thesaurus.com/browse/report and https://www.thesaurus.com/browse/hint. > 2. Everybody has struggled with just where to put the declaration > of the error context structure. The most natural home for it > probably would be elog.h, but that's out because it cannot depend > on nodes.h, and the struct has to be a Node type to conform to > the fmgr safety guidelines. What I've done here is to drop it > in nodes.h, as we've done with a couple of other hard-to-classify > node types; but I can't say I'm satisfied with that. > > Other plausible answers seem to be: > > * Drop it in fmgr.h. The only real problem is that historically > we've not wanted fmgr.h to depend on nodes.h either. But I'm not > sure how strong the argument for that really is/was. If we did > do it like that we could clean up a few kluges, both in this patch > and pre-existing (fmNodePtr at least could go away). > > * Invent a whole new header just for this struct. But then we're > back to the question of what to call it. Maybe something along the > lines of utils/elog_extras.h ? > Would moving ErrorReturnContext and the ErrorData struct to their own util/errordata.h allow us to avoid the void pointer for ercontext? If so, that'd be a win because typed pointers give the reader some idea of what is expected there, as well as aiding doxygen-like tools. Overall this looks like a good foundation.
My own effort was getting bogged down in the number of changes I needed to make in code paths that would never want a failover case for their typecasts, so I'm going to refactor my work on top of this and see where I get stuck.
Re: Error-safe user functions
On Fri, Dec 2, 2022 at 1:46 PM Tom Lane wrote: > Corey Huinker writes: > > I'm still working on organizing my patch, but it grew out of a desire to > do > > this: > > CAST(value AS TypeName DEFAULT expr) > > This is a thing that exists in other forms in other databases and while > it > > may look unrelated, it is essentially the SQL/JSON casts within a nested > > data structure issue, just a lot simpler. > > Okay, maybe that's why I was thinking we had a requirement for > failure-free casts. Sure, you can transform it to the other thing > by always implementing this as a cast-via-IO, but you could run into > semantics issues that way. (If memory serves, we already have cases > where casting X to Y gives a different result from casting X to text > to Y.) > Yes, I was setting aside the issue of direct cast functions for my v0.1 > > > My original plan had been two new params to all _in() functions: a > boolean > > error_mode and a default expression Datum. > > After consulting with Jeff Davis and Michael Paquier, the notion of > > modifying fcinfo itself two booleans: > > allow_error (this call is allowed to return if there was an error with > > INPUT) and > > has_error (this function has the concept of a purely-input-based error, > > and found one) > > Hmm ... my main complaint about that is the lack of any way to report > the details of the error. I realize that a plain boolean failure flag > might be enough for our immediate use-cases, but I don't want to expend > the amount of effort this is going to involve and then later find we > have a use-case where we need the error details. 
> I agree, but then we're past a boolean for allow_error, and we probably get into a list of modes like this:

CAST_ERROR_ERROR          /* default ereport(), what we do now */
CAST_ERROR_REPORT_FULL    /* report that the cast failed, everything that you would have put in the ereport() instead put in a struct that gets returned to caller */
CAST_ERROR_REPORT_SILENT  /* report that the cast failed, but nobody cares why so don't even form the ereport strings, good for bulk operations */
CAST_ERROR_WARNING        /* report that the cast failed, but emit ereport() as a warning */
CAST_ERROR_[NOTICE,LOG,DEBUG1,..DEBUG5]  /* same, but some other loglevel */

> > The sketch that's in my head at the moment is to make use of the existing > "context" field of FunctionCallInfo: if that contains a node of some > to-be-named type, then we are requesting that errors not be thrown > but instead be reported by passing back an ErrorData using a field of > that node. The issue about not constructing an ErrorData if the outer > caller doesn't need it could perhaps be addressed by adding some boolean > flag fields in that node, but the details of that need not be known to > the functions reporting errors this way; it'd be a side channel from the > outer caller to elog.c. > That should be a good place for it, assuming it's not already used like fn_extra is. It would also squash those cases above into just three: ERROR, REPORT_FULL, and REPORT_SILENT, leaving it up to the node what type of erroring/logging is appropriate. > > The main objection I can see to this approach is that we only support > one context value per call, so you could not easily combine this > functionality with existing use-cases for the context field. A quick > census of InitFunctionCallInfoData calls finds aggregates, window > functions, triggers, and procedures, none of which seem like plausible > candidates for wanting no-error behavior, so I'm not too concerned > about that.
(Maybe the error-reporting node could be made a sub-node > of the context node in any future cases where we do need it?) > A subnode had occurred to me when fiddling about with fn_extra, so that applies here, but if we're doing a sub-node, then maybe it's worth its own parameter. I struggled with that because of how few of these functions will use it vs how often they're executed. > > > The one gap I see so far in the patch presented is that it returns a null > > value on bad input, which might be ok if the node has the default, but > that > > then presents the node with having to understand whether it was a null > > because of bad input vs a null that was expected. > > Yeah. That's something we could probably get away with for the case of > input functions only, but I think explicit out-of-band signaling that > there was an error is a more future-proof solution. > I think we'll run into it fairly soon, because if I recall correctly, SQL/JSON has a formatting spec that essentially means that we're not calling input functions, we're calling TO_CHAR() and TO_DATE(), but they're very similar to input functions. One positive side effect of all this is we can get an isa(value, typname) construct like this "for free", we just try the cast and return the value. I'm still working on my patch even though it will probably be sidelined in the hopes that it informs us of any subsequent issues.
Re: Error-safe user functions
On Fri, Dec 2, 2022 at 9:34 AM Andrew Dunstan wrote: > > On 2022-12-02 Fr 09:12, Tom Lane wrote: > > Robert Haas writes: > >> I think the design is evolving in your head as you think about this > >> more, which is totally understandable and actually very good. However, > >> this is also why I think that you should produce the patch you > >> actually want instead of letting other people repeatedly submit > >> patches and then complain that they weren't what you had in mind. > > OK, Corey hasn't said anything, so I will have a look at this over > > the weekend. > > > > > > > Great. Let's hope we can get this settled early next week and then we > can get to work on the next tranche of functions, those that will let > the SQL/JSON work restart. > > > cheers > > > andrew > > -- > Andrew Dunstan > EDB: https://www.enterprisedb.com > > I'm still working on organizing my patch, but it grew out of a desire to do this: CAST(value AS TypeName DEFAULT expr) This is a thing that exists in other forms in other databases and while it may look unrelated, it is essentially the SQL/JSON casts within a nested data structure issue, just a lot simpler. My original plan had been two new params to all _in() functions: a boolean error_mode and a default expression Datum. After consulting with Jeff Davis and Michael Paquier, the notion arose of modifying fcinfo itself with two booleans: allow_error (this call is allowed to return if there was an error with INPUT) and has_error (this function has the concept of a purely-input-based error, and found one) The nice part about this is that unaware functions can ignore these values, and custom data types that did not check these values would continue to work as before. It wouldn't respect the CAST default, but that's up to the extension writer to fix, and that's a pretty acceptable failure mode.
Where this gets tricky is arrays and complex types: the default expression applies only to the object explicitly casted, so if somebody tried CAST ('{"123","abc"}'::text[] AS integer[] DEFAULT '{0}') the inner casts need to know that they _can_ allow input errors, but have no default to offer, they need merely report their failure upstream...and that's where the issues align with the SQL/JSON issue. In pursuing this, I see that my method was simultaneously too little and too much. Too much in the sense that it alters the structure for all fmgr functions, though in a very minor and backward-compatible way, and too little in the sense that we could actually return the ereport info in a structure and leave it to the node to decide whether to raise it or not. Though I should add that there are many situations where we don't care about the specifics of the error, we just want to know that one existed and move on, so time spent forming that return structure would be time wasted. The one gap I see so far in the patch presented is that it returns a null value on bad input, which might be ok if the node has the default, but that then presents the node with having to understand whether it was a null because of bad input vs a null that was expected.
Re: Error-safe user functions
On Fri, Dec 2, 2022 at 9:12 AM Tom Lane wrote: > Robert Haas writes: > > I think the design is evolving in your head as you think about this > > more, which is totally understandable and actually very good. However, > > this is also why I think that you should produce the patch you > > actually want instead of letting other people repeatedly submit > > patches and then complain that they weren't what you had in mind. > > OK, Corey hasn't said anything, so I will have a look at this over > the weekend. > > regards, tom lane > Sorry, had several life issues intervene. Glancing over what was discussed because it seems pretty similar to what I had in mind. Will respond back in detail shortly.
Re: psql: Add command to use extended query protocol
On Tue, Nov 15, 2022 at 8:29 AM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > On 09.11.22 00:12, Corey Huinker wrote: > > As for the docs, they're very clear and probably sufficient as-is, but I > > wonder if we should explicitly state that the bind-state and bind > > parameters do not "stay around" after the query is executed? Suggestions > > in bold: > > > > This command causes the extended query protocol (see > linkend="protocol-query-concepts"/>) to be used, unlike normal > > psql operation, which uses the > simple > > query protocol. *Extended query protocol will be used* *even > > if no parameters are specified, so* this command can be useful to test > > the extended > > query protocol from psql. *This command affects only the next > > query executed, all subsequent queries will use the regular query > > protocol by default.* > > > > Tests seem comprehensive. I went looking for the TAP test that this > > would have replaced, but found none, and it seems the only test where we > > exercise PQsendQueryParams is libpq_pipeline.c, so these tests are a > > welcome addition. > > > > Aside from the possible doc change, it looks ready to go. > > Committed with those doc changes. Thanks.

I got thinking about this, and while things may be fine as-is, I would like to hear some opinions as to whether this behavior is correct:

String literals can include spaces

[16:51:35 EST] corey=# select $1, $2 \bind 'abc def' gee \g
 ?column? | ?column?
----------+----------
 abc def  | gee
(1 row)

String literal includes spaces, but also includes quotes:

Time: 0.363 ms
[16:51:44 EST] corey=# select $1, $2 \bind "abc def" gee \g
 ?column?  | ?column?
-----------+----------
 "abc def" | gee
(1 row)

Semi-colon does not terminate an EQP statement, ';' is seen as a parameter:

[16:51:47 EST] corey=# select $1, $2 \bind "abc def" gee ;
corey-# \g
ERROR:  bind message supplies 3 parameters, but prepared statement "" requires 2

Confirming that slash-commands must be unquoted

[16:52:23 EST] corey=# select $1, $2 \bind "abc def" '\\g' \g
 ?column?  | ?column?
-----------+----------
 "abc def" | \g
(1 row)

[16:59:00 EST] corey=# select $1, $2 \bind "abc def" '\watch' \g
 ?column?  | ?column?
-----------+----------
 "abc def" | watch
(1 row)

Confirming that any slash command terminates the bind list, but ';' does not

[16:59:54 EST] corey=# select $1, $2 \bind "abc def" gee \watch 5
Mon 21 Nov 2022 05:00:07 PM EST (every 5s)

 ?column?  | ?column?
-----------+----------
 "abc def" | gee
(1 row)

Time: 0.422 ms
Mon 21 Nov 2022 05:00:12 PM EST (every 5s)

 ?column?  | ?column?
-----------+----------
 "abc def" | gee
(1 row)

Is this all working as expected?
Re: Error-safe user functions
On Tue, Nov 15, 2022 at 11:36 AM Andrew Dunstan wrote: > > On 2022-10-07 Fr 13:37, Tom Lane wrote: > > > [ lots of detailed review ] > > > Basically, this patch set should be a lot smaller and not have ambitions > > beyond "get the API right" and "make one or two datatypes support COPY > > NULL_ON_ERROR". Add more code once that core functionality gets reviewed > > and committed. > > > > > > > Nikita, > > just checking in, are you making progress on this? I think we really > need to get this reviewed and committed ASAP if we are to have a chance > to get the SQL/JSON stuff reworked to use it in time for release 16. > > I'm making an attempt at this or something very similar to it. I don't yet have a patch ready.
Re: Multitable insert syntax support on Postgres?
> WITH data_src AS (SELECT * FROM source_tbl),
> insert_a AS (INSERT INTO a SELECT * FROM data_src WHERE d < 5),
> insert_b AS (INSERT INTO b SELECT * FROM data_src WHERE d >= 5)
> INSERT INTO c SELECT * FROM data_src WHERE d < 5

I suppose you could just do a dummy SELECT at the bottom to make it look more symmetrical

WITH data_src AS (SELECT * FROM source_tbl),
insert_a AS (INSERT INTO a SELECT * FROM data_src WHERE d < 5),
insert_b AS (INSERT INTO b SELECT * FROM data_src WHERE d >= 5),
insert_c AS (INSERT INTO c SELECT * FROM data_src WHERE d < 5)
SELECT true AS inserts_complete;

Or maybe get some diagnostics out of it:

WITH data_src AS (SELECT * FROM source_tbl),
insert_a AS (INSERT INTO a SELECT * FROM data_src WHERE d < 5 RETURNING NULL),
insert_b AS (INSERT INTO b SELECT * FROM data_src WHERE d >= 5 RETURNING NULL),
insert_c AS (INSERT INTO c SELECT * FROM data_src WHERE d < 5 RETURNING NULL)
SELECT (SELECT COUNT(*) FROM insert_a) AS new_a_rows,
       (SELECT COUNT(*) FROM insert_b) AS new_b_rows,
       (SELECT COUNT(*) FROM insert_c) AS new_c_rows;
Re: Multitable insert syntax support on Postgres?
On Mon, Nov 14, 2022 at 7:06 PM Alexandre hadjinlian guerra < alexhgue...@gmail.com> wrote: > Hello > Are there any plans to incorporate a formal syntax multitable/conditional > insert , similar to the syntax below? snowflake does have the same feature > > https://oracle-base.com/articles/9i/multitable-inserts > > Today, im resorting to a function that receives the necessary parameters > from the attributes definition/selection area in a sql select query, called > by each tuple retrieved. A proper syntax show be real cool > > Thanks! > I'm not aware of any efforts to implement this at this time, mostly because I don't think it's supported in the SQL Standard. Being in the standard would change the question from "why" to "why not". I've used that feature when I worked with Oracle in a data warehouse situation. I found it most useful when migrating data dumps from mainframes where the data file contained subrecords and in cases where one field in a row changes the meaning of subsequent fields in the same row. That may sound like a First Normal Form violation, and it is, but such data formats are common in the IBM VSAM world, or at least they were in the data dumps that I had to import.
Re: Document parameter count limit
> > > +if you are reading this prepatorily, please redesign your > query to use temporary tables or arrays > I agree with the documentation of this parameter. I agree with dissuading anyone from attempting to change it. The wording is bordering on snark (however well deserved) and I think the voice is slightly off. Alternate suggestion: Queries approaching this limit can usually be refactored to use arrays or temporary tables, thus reducing parameter overhead. The bit about parameter overhead appeals to the reader's desire for performance, rather than just focusing on "you shouldn't want this".
Re: refactor ownercheck and aclcheck functions
> > After considering this again, I decided to brute-force this and get rid > of all the trivial wrapper functions and also several of the special > cases. That way, there is less confusion at the call sites about why > this or that style is used in a particular case. Also, it now makes > sure you can't accidentally use the generic functions when a particular > one should be used. > +1 However, the aclcheck patch isn't applying for me now. That patch modifies 37 files, so it's hard to say just which commit conflicts.
Re: psql: Add command to use extended query protocol
> > > Btw., this also allows doing things like > > SELECT $1, $2 > \bind '1' '2' \g > \bind '3' '4' \g > That's one of the things I was hoping for. Very cool. > > This isn't a prepared statement being reused, but it relies on the fact > that psql \g with an empty query buffer resends the previous query. > Still kind of neat. Yeah, if they wanted a prepared statement there's nothing stopping them. Review: Patch applies, tests pass. Code is quite straightforward. As for the docs, they're very clear and probably sufficient as-is, but I wonder if we should explicitly state that the bind-state and bind parameters do not "stay around" after the query is executed? Suggestions in bold: This command causes the extended query protocol (see ) to be used, unlike normal psql operation, which uses the simple query protocol. *Extended query protocol will be used* *even if no parameters are specified, so* this command can be useful to test the extended query protocol from psql. *This command affects only the next query executed, all subsequent queries will use the regular query protocol by default.* Tests seem comprehensive. I went looking for the TAP test that this would have replaced, but found none, and it seems the only test where we exercise PQsendQueryParams is libpq_pipeline.c, so these tests are a welcome addition. Aside from the possible doc change, it looks ready to go.
Re: psql: Add command to use extended query protocol
On Mon, Nov 7, 2022 at 4:12 PM Tom Lane wrote: > Corey Huinker writes: > > I thought about basically reserving the \$[0-9]+ space as bind variables, > > but it is possible, though unlikely, that users have been naming their > > variables like that. > > Don't we already reserve that syntax as Params? Not sure whether there > would be any conflicts versus Params, but these are definitely not legal > as SQL identifiers. > > regards, tom lane > I think Pavel was hinting at something like: \set $1 foo \set $2 123 UPDATE mytable SET value = $1 WHERE id = $2; Which wouldn't step on anything, because I tested it, and \set $1 foo already returns 'Invalid variable name "$1"'. So far, there seem to be two possible variations on how to go about this: 1. Have special variables or a variable namespace that are known to be bind variables. So long as one of them is defined, queries are sent using extended query protocol. 2. Bind parameters one-time-use, applied strictly to the query currently in the buffer in positional order, and once that query is run their association with being binds is gone. Each has its merits, I guess it comes down to how much we expect users to want to re-use some or all the bind params of the previous query.
Re: psql: Add command to use extended query protocol
> > > > what about introduction new syntax for psql variables that should be > passed as bind variables. > I thought about basically reserving the \$[0-9]+ space as bind variables, but it is possible, though unlikely, that users have been naming their variables like that. It's unclear from your example if that's what you meant, or if you wanted actual named variables ($name, $timestamp_before, $x). Actual named variables might cause problems with CREATE FUNCTION AS ... $body$ ... $body$; as well as the need to deduplicate them. So while it is less seamless, I do like the \bind x y z \g idea because it requires no changes in variable interpolation, and the list can be terminated with a slash command or ; To your point about forcing extended query protocol even when no parameters are present, that would be SELECT 1 \bind \g It hasn't been discussed, but the question of how to handle output parameters seems fairly straightforward: the value of the bind variable is the name of the psql variable to be set a la \gset.
Re: pg_dump: Refactor code that constructs ALTER ... OWNER TO commands
On Wed, Nov 2, 2022 at 5:30 PM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote:
> On 01.11.22 13:59, Corey Huinker wrote:
> > On Mon, Oct 24, 2022 at 5:54 AM Peter Eisentraut
> > <mailto:peter.eisentr...@enterprisedb.com>> wrote:
> >
> >     Avoid having to list all the possible object types twice. Instead, only
> >     _getObjectDescription() needs to know about specific object types. It
> >     communicates back to _printTocEntry() whether an owner is to be set.
> >
> >     In passing, remove the logic to use ALTER TABLE to set the owner of
> >     views and sequences. This is no longer necessary. Furthermore, if
> >     pg_dump doesn't recognize the object type, this is now a fatal error,
> >     not a warning.
> >
> > Makes sense, passes all tests.
>
> Committed.
>
> > It's clearly out of scope for this very focused patch, but would it make
> > sense for the TocEntry struct to be populated with a type enumeration
> > integer as well as the type string to make for clearer and faster
> > sifting later?
>
> That could be better, but wouldn't that mean a change of the format of
> pg_dump archives?

Sorry for the confusion, I was thinking strictly of the in-memory representation after it is extracted from the dictionary.
Re: psql: Add command to use extended query protocol
On Fri, Nov 4, 2022 at 11:45 AM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote:
> On 02.11.22 01:18, Corey Huinker wrote:
> > > SELECT $1, $2 \gp 'foo' 'bar'
> >
> > I think this is a great idea, but I foresee people wanting to send that
> > output to a file or a pipe like \g allows. If we assume everything after
> > the \gp is a param, don't we paint ourselves into a corner?
>
> Any thoughts on how that syntax could be generalized?

A few: The most compact idea I can think of is to have \bind and \endbind (or more terse equivalents \bp and \ebp):

SELECT * FROM foo WHERE type_id = $1 AND cost > $2 \bind 'param1' 'param2' \endbind \g filename.csv

Maybe the end-bind param isn't needed at all; we just insist that bind params be single-quoted strings or numbers, so the next slash command ends the bind list.

If that proves difficult, we might save bind params like registers, something like this, positional:

\bind 1 'param1'
\bind 2 'param2'
SELECT * FROM foo WHERE type_id = $1 AND cost > $2 \g filename.csv
\unbind

or all the binds on one line:

\bindmany 'param1' 'param2'
SELECT * FROM foo WHERE type_id = $1 AND cost > $2 \g filename.csv
\unbind

Then psql would merely have to check whether it had any bound registers, and if so, the next query executed uses the extended query protocol, and \unbind wipes out the binds to send us back to regular mode.
Re: Add SHELL_EXIT_CODE to psql
Oops, that sample output was from a previous run, should have been:

-- SHELL_EXIT_CODE is undefined
\echo :SHELL_EXIT_CODE
:SHELL_EXIT_CODE
-- bad \!
\! borp
sh: line 1: borp: command not found
\echo :SHELL_EXIT_CODE
127
-- bad backtick
\set var `borp`
sh: line 1: borp: command not found
\echo :SHELL_EXIT_CODE
127
-- good \!
\! true
\echo :SHELL_EXIT_CODE
0
-- play with exit codes
\! exit 4
\echo :SHELL_EXIT_CODE
4
\set var `exit 3`
\echo :SHELL_EXIT_CODE
3

On Fri, Nov 4, 2022 at 5:08 AM Corey Huinker wrote:
> Over in
> https://www.postgresql.org/message-id/eaf326ad693e74eba068f33a7f518...@oss.nttdata.com
> Justin Pryzby suggested that psql might need the ability to capture the
> shell exit code.
>
> This is a POC patch that does that, but doesn't touch on the ON_ERROR_STOP
> stuff.
>
> I've added some very rudimentary tests, but haven't touched the
> documentation, because I strongly suspect that someone will suggest a
> better name for the variable.
>
> But basically, it works like this
>
> -- SHELL_EXIT_CODE is undefined
> \echo :SHELL_EXIT_CODE
> :SHELL_EXIT_CODE
> -- bad \!
> \! borp
> sh: line 1: borp: command not found
> \echo :SHELL_EXIT_CODE
> 32512
> -- bad backtick
> \set var `borp`
> sh: line 1: borp: command not found
> \echo :SHELL_EXIT_CODE
> 127
> -- good \!
> \! true
> \echo :SHELL_EXIT_CODE
> 0
> -- play with exit codes
> \! exit 4
> \echo :SHELL_EXIT_CODE
> 1024
> \set var `exit 3`
> \echo :SHELL_EXIT_CODE
> 3
>
> Feedback welcome.
Re: Make ON_ERROR_STOP stop on shell script failure
> > I think it'd be a lot better to expose the script status to psql. > (without having to write "foo; echo status=$?"). > I agree, and I hacked up a proof of concept, but started another thread at https://www.postgresql.org/message-id/CADkLM=cWao2x2f+UDw15W1JkVFr_bsxfstw=ngea7r9m4j-...@mail.gmail.com so as not to clutter up this one.
Add SHELL_EXIT_CODE to psql
Over in https://www.postgresql.org/message-id/eaf326ad693e74eba068f33a7f518...@oss.nttdata.com Justin Pryzby suggested that psql might need the ability to capture the shell exit code.

This is a POC patch that does that, but doesn't touch on the ON_ERROR_STOP stuff.

I've added some very rudimentary tests, but haven't touched the documentation, because I strongly suspect that someone will suggest a better name for the variable.

But basically, it works like this

-- SHELL_EXIT_CODE is undefined
\echo :SHELL_EXIT_CODE
:SHELL_EXIT_CODE
-- bad \!
\! borp
sh: line 1: borp: command not found
\echo :SHELL_EXIT_CODE
32512
-- bad backtick
\set var `borp`
sh: line 1: borp: command not found
\echo :SHELL_EXIT_CODE
127
-- good \!
\! true
\echo :SHELL_EXIT_CODE
0
-- play with exit codes
\! exit 4
\echo :SHELL_EXIT_CODE
1024
\set var `exit 3`
\echo :SHELL_EXIT_CODE
3

Feedback welcome.

From 119837575f4d0da804d92ec797bbf11e8075e595 Mon Sep 17 00:00:00 2001
From: coreyhuinker
Date: Fri, 4 Nov 2022 04:45:39 -0400
Subject: [PATCH] POC: expose shell exit code as a psql variable

---
 src/bin/psql/command.c             |  4
 src/bin/psql/help.c                |  2 ++
 src/bin/psql/psqlscanslash.l       | 28 +---
 src/test/regress/expected/psql.out | 11 +++
 src/test/regress/sql/psql.sql      |  7 +++
 5 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index ab613dd49e..4666a63051 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -4957,6 +4957,7 @@ static bool
 do_shell(const char *command)
 {
 	int			result;
+	char		exit_code_buf[32];

 	fflush(NULL);
 	if (!command)
@@ -4984,6 +4985,9 @@ do_shell(const char *command)
 	else
 		result = system(command);

+	snprintf(exit_code_buf, sizeof(exit_code_buf), "%d", WEXITSTATUS(result));
+	SetVariable(pset.vars, "SHELL_EXIT_CODE", exit_code_buf);
+
 	if (result == 127 || result == -1)
 	{
 		pg_log_error("\\!: failed");
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index f8ce1a0706..04f996332e 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -454,6 +454,8 @@ helpVariables(unsigned short int pager)
 		  "show all results of a combined query (\\;) instead of only the last\n");
 	HELP0("  SHOW_CONTEXT\n"
 		  "controls display of message context fields [never, errors, always]\n");
+	HELP0("  SHELL_EXIT_CODE\n"
+		  "Exit code of the last shell command\n");
 	HELP0("  SINGLELINE\n"
 		  "if set, end of line terminates SQL commands (same as -S option)\n");
 	HELP0("  SINGLESTEP\n"
diff --git a/src/bin/psql/psqlscanslash.l b/src/bin/psql/psqlscanslash.l
index a467b72144..30e6f5dcd4 100644
--- a/src/bin/psql/psqlscanslash.l
+++ b/src/bin/psql/psqlscanslash.l
@@ -27,6 +27,8 @@
 %{
 #include "fe_utils/psqlscan_int.h"
+#include "settings.h"
+#include "variables.h"

 /*
  * We must have a typedef YYSTYPE for yylex's first argument, but this lexer
@@ -774,6 +776,8 @@ evaluate_backtick(PsqlScanState state)
 	bool		error = false;
 	char		buf[512];
 	size_t		result;
+	int			exit_code = 0;
+	char		exit_code_buf[32];

 	initPQExpBuffer(&cmd_output);
@@ -783,6 +787,7 @@ evaluate_backtick(PsqlScanState state)
 	{
 		pg_log_error("%s: %m", cmd);
 		error = true;
+		exit_code = -1;
 	}

 	if (!error)
@@ -800,10 +805,25 @@ evaluate_backtick(PsqlScanState state)
 		} while (!feof(fd));
 	}

-	if (fd && pclose(fd) == -1)
+	if (fd)
 	{
-		pg_log_error("%s: %m", cmd);
-		error = true;
+		exit_code = pclose(fd);
+		if (exit_code == -1)
+		{
+			pg_log_error("%s: %m", cmd);
+			error = true;
+		}
+		if (WIFEXITED(exit_code))
+		{
+			exit_code = WEXITSTATUS(exit_code);
+		}
+		else if (WIFSIGNALED(exit_code))
+		{
+			exit_code = WTERMSIG(exit_code);
+		}
+		else if (WIFSTOPPED(exit_code))
+		{
+			/* If you need to act upon the process stopping, do it here. */
+			exit_code = WSTOPSIG(exit_code);
+		}
 	}

 	if (PQExpBufferDataBroken(cmd_output))
@@ -826,5 +846,7 @@ evaluate_backtick(PsqlScanState state)
 		appendBinaryPQExpBuffer(output_buf, cmd_output.data, cmd_output.len);
 	}

+	snprintf(exit_code_buf, sizeof(exit_code_buf), "%d", exit_code);
+	SetVariable(pset.vars, "SHELL_EXIT_CODE", exit_code_buf);
 	termPQExpBuffer(&cmd_output);
 }
diff --git a/src/test/regress/expected/psql.out b/src/test/regress/expected/psql.out
index a7f5700edc..33202ffda7 100644
--- a/src/test/regress/expected/psql.out
+++ b/src/test/regress/expected/psql.out
@@ -1275,6 +1275,17 @@ execute q;
 ++-+

 deallocate q;
+-- test SHELL_EXIT_CODE
+\! nosuchcommand
+sh: line 1: nosuchcommand: command not found
+\echo :SHELL_EXIT_CODE
+127
+\set nosuchvar `nosuchcommand`
+sh: line 1: nosuchcommand: command not found
+\! nosuchcommand
+sh: line 1: nosuchcommand: command not found
+\echo :SHELL_EXIT_CODE
+127
 -- test single-line header and data
 prepare q as select repeat('x',2*n) as "0123456789abcdef", repeat('y',20-2*n) as "0123456789" from generate_series(1,10) as n;
 \pset
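The WIFEXITED/WEXITSTATUS dance in the patch above (and the 32512-vs-127 discrepancy in the sample output) is observable from any POSIX language; here is a quick illustration using Python's os module, which exposes the same macros (a sketch for clarity, not part of the patch):

```python
import os

# os.system(), like C's system(), returns a raw wait status on POSIX:
# the real exit code lives in the high byte, so "exit 4" yields 1024.
status = os.system("exit 4")

# Decoding mirrors the patch: check how the child ended, then extract.
if os.WIFEXITED(status):
    exit_code = os.WEXITSTATUS(status)   # normal exit: the real code
elif os.WIFSIGNALED(status):
    exit_code = os.WTERMSIG(status)      # killed by a signal
else:
    exit_code = -1

print(status)     # raw wait status (1024 on POSIX for "exit 4")
print(exit_code)  # 4
```

This is exactly why the first posted sample showed 32512 for a command-not-found (raw status) where the corrected run shows 127 (decoded exit code).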
Re: psql: Add command to use extended query protocol
> > > SELECT $1, $2 \gp 'foo' 'bar' > > I think this is a great idea, but I foresee people wanting to send that output to a file or a pipe like \g allows. If we assume everything after the \gp is a param, don't we paint ourselves into a corner?
Re: pg_dump: Refactor code that constructs ALTER ... OWNER TO commands
On Mon, Oct 24, 2022 at 5:54 AM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote:
> Avoid having to list all the possible object types twice. Instead, only
> _getObjectDescription() needs to know about specific object types. It
> communicates back to _printTocEntry() whether an owner is to be set.
>
> In passing, remove the logic to use ALTER TABLE to set the owner of
> views and sequences. This is no longer necessary. Furthermore, if
> pg_dump doesn't recognize the object type, this is now a fatal error,
> not a warning.

Makes sense, passes all tests.

It's clearly out of scope for this very focused patch, but would it make sense for the TocEntry struct to be populated with a type enumeration integer as well as the type string to make for clearer and faster sifting later?
Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)
On Tue, Sep 20, 2022 at 2:32 PM Nathan Bossart wrote:
> Here is a rebased patch for cfbot.

Applies, passes make check-world. The patch is straightforward, but the previous code is less so. It purported to set XMAX_COMMITTED _or_ XMAX_INVALID, but never seemed to un-set XMAX_COMMITTED; was that the source of the double-setting?
Re: refactor ownercheck and aclcheck functions
On Fri, Oct 14, 2022 at 3:39 AM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote:
> These patches take the dozens of mostly-duplicate pg_foo_ownercheck()
> and pg_foo_aclcheck() functions and replace (most of) them by common
> functions that are driven by the ObjectProperty table. All the required
> information is already in that table.
>
> This is similar to the consolidation of the drop-by-OID functions that
> we did a while ago (b1d32d3e3230f00b5baba08f75b4f665c7d6dac6).

Nice reduction in footprint! I'd be inclined to remove the highly used ones as well. That way the codebase would have more examples of object_ownercheck() for readers to see. Seeing the existence of pg_FOO_ownercheck implies that a pg_BAR_ownercheck might exist, and if BAR is missing they might be inclined to re-add it.

If we do keep them, would it make sense to go the extra step and turn the remaining six "regular" ones into static inline functions or even #define-s?
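The consolidation pattern being discussed, one generic check driven by per-object-class metadata instead of one hand-written function per class, looks roughly like this (illustrative Python with invented names, not the C patch):

```python
# Hypothetical stand-in for the ObjectProperty catalog metadata: each
# object class records which catalog column holds the owner OID.
OBJECT_PROPERTY = {
    "table":    {"owner_field": "relowner"},
    "function": {"owner_field": "proowner"},
    "schema":   {"owner_field": "nspowner"},
}


def object_ownercheck(obj_class, catalog_row, role_id):
    """One table-driven check replacing dozens of near-identical
    per-class ownercheck functions."""
    prop = OBJECT_PROPERTY[obj_class]
    return catalog_row[prop["owner_field"]] == role_id


row = {"relowner": 10}
print(object_ownercheck("table", row, 10))   # True
print(object_ownercheck("table", row, 42))   # False
```

Adding a new object class then means adding one metadata entry, not one more copy-pasted function, which is the footprint reduction praised above.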
Re: ts_locale.c: why no t_isalnum() test?
On Wed, Oct 5, 2022 at 3:53 PM Tom Lane wrote: > I happened to wonder why various places are testing things like > > #define ISWORDCHR(c)(t_isalpha(c) || t_isdigit(c)) > > rather than using an isalnum-equivalent test. The direct answer > is that ts_locale.c/.h provides no such test function, which > apparently is because there's not a lot of potential callers in > the core code. However, both pg_trgm and ltree could benefit > from adding one. > > There's no semantic hazard here: the documentation I consulted > is all pretty explicit that is[w]alnum is true exactly when > either is[w]alpha or is[w]digit are. For example, POSIX saith > > The iswalpha() and iswalpha_l() functions shall test whether wc is a > wide-character code representing a character of class alpha in the > current locale, or in the locale represented by locale, respectively; > see XBD Locale. > > The iswdigit() and iswdigit_l() functions shall test whether wc is a > wide-character code representing a character of class digit in the > current locale, or in the locale represented by locale, respectively; > see XBD Locale. > > The iswalnum() and iswalnum_l() functions shall test whether wc is a > wide-character code representing a character of class alpha or digit > in the current locale, or in the locale represented by locale, > respectively; see XBD Locale. > > While I didn't try to actually measure it, these functions don't > look remarkably cheap. Doing char2wchar() twice when we only need > to do it once seems silly, and the libc functions themselves are > probably none too cheap for multibyte characters either. > > Hence, I propose the attached. I got rid of some places that were > unnecessarily checking pg_mblen before applying t_iseq(), too. > > regards, tom lane > > I see this is already committed, but I'm curious, why do t_isalpha and t_isdigit have the pair of /* TODO */ comments? This unfinished business isn't explained anywhere in the file.
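The POSIX identity quoted above (alnum holds exactly when alpha or digit does) is easy to spot-check. Here is a throwaway verification over the ASCII range, using Python's character classifiers, which agree with the POSIX C-locale classes in that range (beyond ASCII, Python's isalnum() also counts "numeric" characters, so the check is deliberately restricted):

```python
# Verify: isalnum(c) == isalpha(c) or isdigit(c), for all ASCII chars.
for i in range(128):
    c = chr(i)
    assert c.isalnum() == (c.isalpha() or c.isdigit()), repr(c)

print("identity holds for all 128 ASCII characters")
```

Which is the semantic guarantee that makes replacing `t_isalpha(c) || t_isdigit(c)` with a single t_isalnum() call safe.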
Re: Getting rid of SQLValueFunction
On Fri, Sep 30, 2022 at 2:04 AM Michael Paquier wrote: > Hi all, > > I have bumped a few days ago on the fact that COERCE_SQL_SYNTAX > (introduced by 40c24bf) and SQLValueFunction are around to do the > exact same thing, as known as enforcing single-function calls with > dedicated SQL keywords. For example, keywords like SESSION_USER, > CURRENT_DATE, etc. go through SQLValueFunction and rely on the parser > to set a state that gets then used in execExprInterp.c. And it is > rather easy to implement incorrect SQLValueFunctions, as these rely on > more hardcoded assumptions in the parser and executor than the > equivalent FuncCalls (like collation to assign when using a text-like > SQLValueFunctions). > > There are two categories of single-value functions: > - The ones returning names, where we enforce a C collation in two > places of the code (current_role, role, current_catalog, > current_schema, current_database, current_user), even if > get_typcollation() should do that for name types. > - The ones working on time, date and timestamps (localtime[stamp], > current_date, current_time[stamp]), for 9 patterns as these accept an > optional typmod. > > I have dug into the possibility to unify all that with a single > interface, and finish with the attached patch set which is a reduction > of code, where all the SQLValueFunctions are replaced by a set of > FuncCalls: > 25 files changed, 338 insertions(+), 477 deletions(-) > > 0001 is the move done for the name-related functions, cleaning up two > places in the executor when a C collation is assigned to those > function expressions. 0002 is the remaining cleanup for the > time-related ones, moving a set of parser-side checks to the execution > path within each function, so as all this knowledge is now local to > each file holding the date and timestamp types. Most of the gain is > in 0002, obviously. > > The pg_proc entries introduced for the sake of the move use the same > name as the SQL keywords. 
> These should perhaps be prefixed with a "pg_" at least. There would be
> an exception with pg_localtime[stamp], though, where we could use a
> pg_localtime[stamp]_sql for the function name for prosrc. I am open to
> suggestions for these names.
>
> Thoughts?
> --
> Michael

I like this a lot. Deleted code is debugged code.

Patch applies and passes make check-world. No trace of SQLValueFunction is left in the codebase, at least according to `git grep -l`.

I have only one non-nitpick question about the code:

+	/*
+	 * we're not too tense about good error message here because grammar
+	 * shouldn't allow wrong number of modifiers for TIME
+	 */
+	if (n != 1)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("invalid type modifier")));

I agree that we shouldn't spend too much effort on a good error message here, but perhaps we should have the message mention that it is date/time-related? A person git-grepping for this error message will get 4 hits in .c files (date.c, timestamp.c, varbit.c, varchar.c), so even a slight variation in the message could save them some time.

This is an extreme nitpick, but the patchset seems like it should have been 1 file or 3 (remove name functions, remove time functions, remove SQLValueFunction infrastructure), but that will only matter in the unlikely case that we find a need for SQLValueFunction but we want to leave the timestamp functions as COERCE_SQL_SYNTAX.
Re: predefined role(s) for VACUUM and ANALYZE
> > Sounds good. Here's a new patch set with aclitem's typalign fixed. > Patch applies. Passes make check and make check-world. Test coverage seems adequate. Coding is very clear and very much in the style of the existing code. Any quibbles I have with the coding style are ones I have with the overall pg-style, and this isn't the forum for that. I haven't done any benchmarking yet, but it seems that the main question will be the impact on ordinary DML statements. I have no opinion about the design debate earlier in this thread, but I do think that this patch is ready and adds something concrete to the ongoing discussion.
WIP: Analyze whether our docs need more granular refentries.
In reviewing another patch, I noticed that the documentation had an xref to a fairly large page of documentation (create_table.sgml), and I wondered if that link was chosen because the original author genuinely felt the entire page was relevant, or merely because a more granular link did not exist at the time, and this link had been carried forward since then while the referenced page grew in complexity.

In the interest of narrowing the problem down to a manageable size, I wrote a script (attached) to find all xrefs and rank them by criteria[1] that I believe hint at the possibility that the xrefs should be more granular than they are. I intend to use the script output below as a guide for manually reviewing the references and seeing if there are opportunities to guide the reader to the relevant section of those pages.

In case anyone is curious, here is a top excerpt of the script output:

file_name                      link_name                     link_count  line_count  num_refentries
-----------------------------  ----------------------------  ----------  ----------  --------------
ref/psql-ref.sgml              app-psql                              20        5215               1
ecpg.sgml                      ecpg-sql-allocate-descriptor           4       10101              17
ref/create_table.sgml          sql-createtable                       23        2437               1
ref/select.sgml                sql-select                            23        2207               1
ref/create_function.sgml       sql-createfunction                    30         935               1
ref/alter_table.sgml           sql-altertable                        12        1776               1
ref/pg_dump.sgml               app-pgdump                            11        1545               1
ref/pg_basebackup.sgml         app-pgbasebackup                      11        1008               1
ref/create_type.sgml           sql-createtype                        10        1029               1
ref/create_index.sgml          sql-createindex                        9         999               1
ref/postgres-ref.sgml          app-postgres                          10         845               1
ref/copy.sgml                  sql-copy                               7        1081               1
ref/create_role.sgml           sql-createrole                        13         511               1
ref/grant.sgml                 sql-grant                             13         507               1
ref/create_foreign_table.sgml  sql-createforeigntable                14         455               1
ref/insert.sgml                sql-insert                             8         792               1
ref/pg_ctl-ref.sgml            app-pg-ctl                             8         713               1
ref/create_trigger.sgml        sql-createtrigger                      7         777               1
ref/set.sgml                   sql-set                               15         332               1
ref/create_aggregate.sgml      sql-createaggregate                    6         805               1
ref/initdb.sgml                app-initdb                             8         588               1
ref/create_policy.sgml         sql-createpolicy                       7         655               1
dblink.sgml                    contrib-dblink-connect                 1        2136              19
ref/create_subscription.sgml   sql-createsubscription                 9         472               1

Some of these will clearly be false positives. For instance, dblink.sgml and ecpg.sgml have a lot of refentries, but they seem to lack a global "top" refentry which I assumed would be there. On the other hand, I have to wonder if the references to psql might be to a specific feature of the tool, and perhaps we can create refentries to those.

[1] The criteria is: must be first refentry in file, file must be at least 200 lines long, then rank by lines*references, 2x for referencing the top refentry when others exist

xref-analysis.sh
Description: application/shellscript
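The ranking criteria in [1] can be restated as a small scoring function (a hypothetical reconstruction for illustration, not the attached shell script itself):

```python
def xref_score(line_count, link_count, refers_to_top, num_refentries):
    """Rank a referenced file: longer pages with more inbound links score
    higher; links to the top refentry of a multi-refentry page count
    double, since a more granular target probably exists."""
    if line_count < 200:
        return 0  # short files are assumed granular enough already
    score = line_count * link_count
    if refers_to_top and num_refentries > 1:
        score *= 2
    return score


# e.g. a 5215-line psql-ref.sgml with 20 inbound links to its only refentry
print(xref_score(5215, 20, refers_to_top=True, num_refentries=1))  # 104300
```

Under this reading, pages like psql-ref.sgml dominate simply because they are both huge and heavily linked, which matches the excerpt above.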
Re: future of serial and identity columns
> > The feedback was pretty positive, so I dug through all the tests to at > least get to the point where I could see the end of it. The attached > patch 0001 is the actual code and documentation changes. The 0002 patch > is just tests randomly updated or disabled to make the whole suite pass. > This reveals that there are a few things that would warrant further > investigation, in particular around extensions and partitioning. To be > continued. > I like what I see so far! Question: the xref refers the reader to sql-createtable, which is a pretty big page, which could leave the reader lost. Would it make sense to create a SQL-CREATETABLE-IDENTITY anchor and link to that instead?
Re: [patch] \g with multiple result sets and \watch with copy queries
> > This is a bit more complicated than the usual tests, but not > that much. > Any opinions on this? +1 I think that because it is more complicated than usual psql, we may want to comment on the intention of the tests and some of the less-than-common psql elements (\set concatenation, resetting \o, etc). If you see value in that I can amend the patch. Are there any options on COPY (header, formats) that we think we should test as well?
Re: Error-safe user functions
> > The idea is simple -- introduce new "error-safe" calling mode of user
> functions by passing special node through FunctCallInfo.context, in
> which function should write error info and return instead of throwing
> it. Also such functions should manually free resources before
> returning an error. This gives ability to avoid PG_TRY/PG_CATCH and
> subtransactions.

I tried something similar when trying to implement TRY_CAST (https://learn.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql?view=sql-server-ver16) late last year. I also considered having a default datum rather than just returning NULL. I had not considered a new node type.

I had considered having every function have a "safe" version, which would be a big duplication of logic requiring a lot of regression tests and possibly fuzzing tests. Instead, I extended every core input function to have an extra boolean parameter to indicate if failures were allowed, and then an extra Datum parameter for the default value. The input function wouldn't need to check the value of the new parameters until it was already in a situation where it found invalid data, but the extra overhead still remained, and it meant that basically every third-party type extension would need to be changed.

Then I considered whether the cast failure should be completely silent, or if the previous error message should instead be emitted as a LOG/INFO/WARN, and if we'd want that to be configurable, so then the boolean parameter became an integer enum:

* regular fail (0)
* use default silently (1)
* use default, emit LOG/NOTICE/WARNING (2,3,4)

At the time, all of this seemed like too big of a change for a function that isn't even in the SQL Standard, but maybe SQL/JSON changes that. If so, it would allow for a can-cast-to test that users would find very useful.
Something like:

SELECT CASE WHEN 'abc' CAN BE integer THEN 'Integer' ELSE 'Nope' END

There's obviously no standard syntax to support that, but the data cleansing possibilities would be great.
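The TRY_CAST-style behavior described above, return a default instead of erroring, optionally logging the suppressed error, can be sketched outside the server like this (illustrative only; the names and the level parameter are invented):

```python
import logging


def try_cast_int(text, default=None, log_level=None):
    """Attempt the cast; on failure return `default`, optionally logging
    the would-have-been error instead of raising it -- mirroring the
    fail / default-silently / default-with-message enum described above."""
    try:
        return int(text)
    except ValueError as err:
        if log_level is not None:
            logging.log(log_level, "invalid input for integer: %s", err)
        return default


print(try_cast_int("123"))                                        # 123
print(try_cast_int("abc"))                                        # None
print(try_cast_int("abc", default=0, log_level=logging.WARNING))  # 0
```

The one branch point on the failure path is the whole idea: the happy path pays nothing for the safety knob, which is the same property the extra-parameter approach above was aiming for.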
Re: Query generates infinite loop
> > Less sure about that. ISTM the reason that the previous proposal failed
> was that it introduced too much ambiguity about how to resolve
> unknown-type arguments. Wouldn't the same problems arise here?

If I recall, the problem was that the lack of a date-specific generate_series function would result in a date value being coerced to timestamp, and thus adding generate_series(date, date, step) would change behavior of existing code, and that was a POLA violation (among other bad things). By adding a different function, there is no prior behavior to worry about. So we should be safe with the following signatures doing the right thing, yes?

generate_finite_series(start timestamp, step interval, num_elements integer)
generate_finite_series(start date, step integer, num_elements integer)
generate_finite_series(start date, step interval year to month, num_elements integer)
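The intended semantics of those signatures can be sketched quickly (illustrative Python; the real thing would be a set-returning function in C):

```python
from datetime import date, timedelta


def generate_finite_series(start, step, num_elements):
    """Yield exactly num_elements values: start, start+step, ...
    There is no upper bound, so infinite endpoints never enter the
    picture -- the series is finite by construction."""
    value = start
    for _ in range(num_elements):
        yield value
        value = value + step


print(list(generate_finite_series(date(2022, 1, 1), timedelta(days=7), 3)))
# [datetime.date(2022, 1, 1), datetime.date(2022, 1, 8), datetime.date(2022, 1, 15)]
```

Because the caller supplies a count rather than a stop value, the infinite-loop and infinity-endpoint questions from earlier in the thread simply don't arise.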
Re: Query generates infinite loop
On Mon, May 9, 2022 at 12:02 AM Tom Lane wrote:
> Corey Huinker writes:
> > On Wed, May 4, 2022 at 3:01 PM Jeff Janes wrote:
> >> On Wed, Apr 20, 2022 at 5:43 PM Tom Lane wrote:
> >>> Oh --- looks like numeric generate_series() already throws error for
> >>> this, so we should just make the timestamp variants do the same.
>
> > This came up once before
> > https://www.postgresql.org/message-id/CAB7nPqQUuUh_W3s55eSiMnt901Ud3meF7f_96yPkKcqfd1ZaMg%40mail.gmail.com
>
> Oh! I'd totally forgotten that thread, but given that discussion,
> and particularly the counterexample at
> https://www.postgresql.org/message-id/16807.1456091547%40sss.pgh.pa.us
> it now feels to me like maybe this change was a mistake. Perhaps
> instead of the committed change, we ought to go the other way and
> rip out the infinity checks in numeric generate_series().

The infinite-upper-bound-with-limit-pushdown counterexample makes sense, but it seems like we're using generate_series() only because we lack a function that generates a series of N elements without a specified upper bound, something like

generate_finite_series(start, step, num_elements)

And if we did that, I'd lobby that we have one that takes dates as well as one that takes timestamps, because that was my reason for starting the thread above.
Re: Query generates infinite loop
On Wed, May 4, 2022 at 3:01 PM Jeff Janes wrote: > On Wed, Apr 20, 2022 at 5:43 PM Tom Lane wrote: > >> I wrote: >> > it's true that infinities as generate_series endpoints are going >> > to work pretty oddly, so I agree with the idea of forbidding 'em. >> >> > Numeric has infinity as of late, so the numeric variant would >> > need to do this too. >> >> Oh --- looks like numeric generate_series() already throws error for >> this, so we should just make the timestamp variants do the same. >> > > The regression test you added for this change causes an infinite loop when > run against an unpatched server with --install-check. That is a bit > unpleasant. Is there something we can and should do about that? I was > expecting regression test failures of course but not an infinite loop > leading towards disk exhaustion. > > Cheers, > > Jeff > This came up once before https://www.postgresql.org/message-id/CAB7nPqQUuUh_W3s55eSiMnt901Ud3meF7f_96yPkKcqfd1ZaMg%40mail.gmail.com
Re: Window Function "Run Conditions"
On Tue, Mar 15, 2022 at 5:24 PM Greg Stark wrote: > This looks like an awesome addition. > > I have one technical questions... > > Is it possible to actually transform the row_number case into a LIMIT > clause or make the planner support for this case equivalent to it (in > which case we can replace the LIMIT clause planning to transform into > a window function)? > > The reason I ask is because the Limit plan node is actually quite a > bit more optimized than the general window function plan node. It > calculates cost estimates based on the limit and can support Top-N > sort nodes. > > But the bigger question is whether this patch is ready for a committer > to look at? Were you able to resolve Andy Fan's bug report? Did you > resolve the two questions in the original email? > +1 to all this It seems like this effort would aid in implementing what some other databases implement via the QUALIFY clause, which is to window functions what HAVING is to aggregate functions. example: https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#qualify_clause
Re: WIP: System Versioned Temporal Table
> > > The spec does not allow schema changes at all on a a system versioned > table, except to change the system versioning itself. > > That would greatly simplify things!
Re: WIP: System Versioned Temporal Table
> > > > 2. Putting data in a side table. This makes DROP SYSTEM VERSIONING > > fairly trivial, but it complicates many DDL commands (please make a > > list?) and requires the optimizer to know about this and cater to it, > > possibly complicating plans. Neither issue is insurmountable, but it > > becomes more intrusive. > > I'd vouch for this being the way to go; you completely sidestep issues > like partitioning, unique constraints, optimization, etc. Especially > true when 90% of the time, SELECTs will only be looking at > currently-active data. MDB seems to have gone with the single-table > approach (unless you partition) and I've run into a bug where I can't > add a unique constraint because historical data fails. > > System versioning & Application versioning > I saw that there is an intent to harmonize system versioning with > application versioning. Haven't read the AV thread so not positive if > that meant intending to split tables by application versioning and > system versioning both: to me it seems like maybe it would be good to > use a separate table for SV, but keep AV in the same table. Reasons > include: > The proposed AV uses just one table. > - ISO states only one AV config per table, but there's no reason this > always has to be the case; maybe you're storing products that are > active for a period of time, EOL for a period of time, and obsolete > for a period of time. If ISO sometime decides >1 AV config is OK, > there would be a mess trying to split that into tables. > The proposed AV (so far) allows for that. > - DB users who are allowed to change AV items likely won't be allowed > to rewrite history by changing SV items. My proposed schema would keep > these separate. > - Table schemas change, and all (SV active) AV items would logically > need to fit the active schema or be updated to do so. Different story > for SV, nothing there should ever need to be changed. 
>

Yeah, there's a mess (which you state below) about what happens if you create a table and then rename a column, or drop a column and add a same-named column back of another type at a later date, etc. In theory, this means that the valid set of columns and their types changes according to the time range specified. I may not be remembering correctly, but Vik stated that the SQL spec seemed to imply that you had to track all those things.

> - Partitioning for AV tables isn't as clear as with SV and is likely
> better to be user-defined

So this was something I was asking various parties about at PgConf NYC just a few weeks ago. I am supposing that the main reason for SV is a regulatory concern; what tolerance do regulators have for the ability to manipulate the SV side-table? Is it possible to directly insert rows into one? If not, then moving rows into a new partition becomes impossible, and you'd be stuck with the partitioning strategy (if any) that you defined at SV creation time. The feedback I got was "well, you're already a superuser; if a regulator had a problem with that then they would have required that the SV table's storage be outside the server, either a foreign table, a csv foreign data wrapper of some sort, or a trigger writing to a non-db storage (which wouldn't even need SV)".

From that, I concluded that every single AV partition would have its own SV table, which could in turn be partitioned. In a sense, it might be helpful to think of the SV tables as partitions of the main table, and the period definition would effectively be the constraint that prunes the SV partition.

> Sorry for acronyms, SV=system versioning, AV=application versioning
>
> In general, I think AV should be treated literally as extra rows in
> the main DB, plus the extra PK element and shortcut functions. SV
> though, needs to have a lot more nuance.
>
> ALTER TABLE
> On to ideas about how ALTER TABLE could work.
I don't think the > question was ever answered "Do schema changes need to be tracked?" I'm > generally in favor of saying that it should be possible to recreate > the table exactly as it was, schema and all, at a specific period of > time (perhaps for a view) using a fancy combination of SELECT ... AS > and such - but it doesn't need to be straightforward. In any case, no > data should ever be deleted by ALTER TABLE. As someone pointed out > earlier, speed and storage space of ALTER TABLE are likely low > considerations for system versioned tables. > > - ADD COLUMN easy, add the column to both the current and historical > table, all null in historical > - DROP COLUMN delete the column from the current table. Historical is > difficult, because what happens if a new column with the same name is > added? Maybe `DROP COLUMN col1` would rename col1 to _col1_1642929683 > (epoch time) in the historical table or something like that. > - RENAME COLUMN is a bit tricky too - from a usability standpoint, the > historical table should be renamed as well. A quick thought is maybe > `RENAME col1 TO new_name` would
Re: Push down time-related SQLValue functions to foreign server
> > Hmm ... not really, because for these particular functions, the > point is exactly that we *don't* translate them to some function > call on the remote end. We evaluate them locally and push the > resulting constant to the far side, thus avoiding issues like > clock skew. > Ah, my pattern matching brain was so excited to see a use for routine mapping that I didn't notice that important detail.
Re: Push down time-related SQLValue functions to foreign server
> The implementation of converting now() to CURRENT_TIMESTAMP > seems like an underdocumented kluge, too. > I'm very late to the party, but it seems to me that this effort is describing a small subset of what "routine mapping" seems to be for: defining function calls that can be pushed down to the foreign server, and the analogous function on the foreign side. We may want to consider implementing just enough of CREATE ROUTINE MAPPING and DROP ROUTINE MAPPING to support these specific fixed functions.
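For context, the SQL/MED routine-mapping DDL being alluded to would look roughly like this. This is a hypothetical sketch from memory of the standard; neither statement exists in PostgreSQL today, and the OPTIONS clause spelling is an assumption:

```sql
-- Hypothetical: map a local routine to its analogue on a foreign server,
-- so the planner knows the call may be pushed down. None of this syntax
-- is implemented in PostgreSQL.
CREATE ROUTINE MAPPING local_now_mapping
    FOR now()
    SERVER remote_pg
    OPTIONS (remote_routine 'CURRENT_TIMESTAMP');

DROP ROUTINE MAPPING local_now_mapping;
```

As the preceding messages note, for the SQLValue functions specifically the desired behavior is the opposite of a mapping: evaluate locally and ship a constant, so a mapping facility would complement rather than replace that fix.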
Re: Add 64-bit XIDs into PostgreSQL 15
> > I'd be > curious to know where we found the bits for that -- the tuple header > isn't exactly replete with extra bit space. > +1 - and can we somehow shoehorn a version # into the new format so we never have to look for spare bits again.
Re: SQL:2011 application time
> > > But > the standard says that dropping system versioning should automatically > drop all historical records (rule 2 under Part 2: Foundation, 11.30 <system versioning clause>). That actually makes sense though: when you > do DML we automatically update the start/end columns, but we don't > save copies of the previous data (and incidentally the end column will > always be the max value.) This is what I was referring to when I mentioned a side-table. Deleting history would be an O(1) operation. Any other misunderstandings are all mine.
Re: Suggestion: optionally return default value instead of error on failed cast
On Thu, Jan 6, 2022 at 12:18 PM Andrew Dunstan wrote: > > On 1/4/22 22:17, Corey Huinker wrote: > > > > currently a failed cast throws an error. It would be useful to have a > > way to get a default value instead. > > > > > > I've recently encountered situations where this would have been > > helpful. Recently I came across some client code: > > > > CREATE OR REPLACE FUNCTION is_valid_json(str text) RETURNS boolean > > LANGUAGE PLPGSQL > > AS $$ > > DECLARE > > j json; > > BEGIN > > j := str::json; > > return true; > > EXCEPTION WHEN OTHERS THEN return false; > > END > > $$; > > > > > > This is a double-bummer. First, the function discards the value so we > > have to recompute it, and secondly, the exception block prevents the > > query from being parallelized. > > > This particular case is catered for in the SQL/JSON patches which > several people are currently reviewing: > > That's great to know, but it would still be parsing the json twice, once to learn that it is legit json, and once to get the casted value. Also, I had a similar issue with type numeric, so having generic "x is a type_y" support would essentially do everything that a try_catch()-ish construct would need to do, and be more generic.
Re: SQL:2011 application time
On Wed, Jan 5, 2022 at 11:07 AM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > On 21.11.21 02:51, Paul A Jungwirth wrote: > > Here are updated patches. They are rebased and clean up some of my > > TODOs. > > This patch set looks very interesting. It's also very big, so it's > difficult to see how to get a handle on it. I did a pass through it > to see if there were any obvious architectural or coding style > problems. I also looked at some of your TODO comments to see if I had > something to contribute there. > > I'm confused about how to query tables based on application time > periods. Online, I see examples using AS OF, but in the SQL standard > I only see this used for system time, which we are not doing here. > What is your understanding of that? > Paul has previously supplied me with this document https://cs.ulb.ac.be/public/_media/teaching/infoh415/tempfeaturessql2011.pdf and that formed the basis of a lot of my questions a few months earlier. There was similar work being done for system periods, which are a bit simpler but require a side (history) table to be created. I was picking people's brains about some aspects of system versioning to see if I could help bringing that into this already very large patchset, but haven't yet felt like I had done enough research to post it. It is my hope that we can at least get the syntax for both application and system versioning committed, even if it's just stubbed in with not-yet-supported errors.
Re: Suggestion: optionally return default value instead of error on failed cast
> > currently a failed cast throws an error. It would be useful to have a > way to get a default value instead. >

I've recently encountered situations where this would have been helpful. Recently I came across some client code:

CREATE OR REPLACE FUNCTION is_valid_json(str text) RETURNS boolean
LANGUAGE PLPGSQL
AS $$
DECLARE
    j json;
BEGIN
    j := str::json;
    return true;
EXCEPTION WHEN OTHERS THEN return false;
END
$$;

This is a double-bummer. First, the function discards the value so we have to recompute it, and secondly, the exception block prevents the query from being parallelized.

> T-SQL has try_cast [1]

I'd be more in favor of this if we learn that there's no work (current or proposed) in the SQL standard.

> Oracle has CAST(... AS .. DEFAULT ... ON CONVERSION ERROR) [2]

If the SQL group has suggested anything, I'd bet it looks a lot like this.

> The DEFAULT ... ON CONVERSION ERROR syntax seems like it could be
> implemented in PostgreSQL. Even if only DEFAULT NULL was supported (at
> first) that would already help.
>
> The short syntax could be extended for the DEFAULT NULL case, too:
>
> SELECT '...'::type   -- throws error
> SELECT '...':::type  -- returns NULL

I think I'm against adding a ::: operator, because too many people are going to type (or omit) the third : by accident, and that would be a really subtle bug. The CAST/TRY_CAST syntax is wordy, but it makes it very clear that you expected janky input and have specified a contingency plan.

The TypeCast node seems like it wouldn't need too much modification to allow for this. The big lift, from what I can tell, is either creating versions of every $foo_in() function to return NULL instead of raising an error, and then effectively wrapping that inside a coalesce() to process the default. Alternatively, we could add an extra boolean parameter ("nullOnFailure"? "suppressErrors"?)
to the existing $foo_in() functions, a boolean to return null instead of raising an error, and the default would be handled in coerce_to_target_type(). Either of those would create a fair amount of work for extensions that add types, but I think the value would be worth it. I do remember when I proposed the "void"/"black hole"/"meh" datatype (all values map to NULL) I ran into a fairly fundamental rule that types must map any not-null input to a not-null output, and this could potentially violate that, but I'm not sure. Does anyone know if the SQL standard has anything to say on this subject?
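A generic version of the earlier workaround is possible today, though it still relies on an EXCEPTION block and therefore still blocks parallelism, which is part of the motivation for a native syntax. This is only a sketch; the function name and signature are made up:

```sql
-- A hand-rolled try_cast for one type. A native
-- CAST(... DEFAULT ... ON CONVERSION ERROR) would avoid both the
-- per-type boilerplate and the parallelism-blocking exception handler.
CREATE OR REPLACE FUNCTION try_cast_numeric(str text,
                                            fallback numeric DEFAULT NULL)
RETURNS numeric LANGUAGE plpgsql AS $$
BEGIN
    RETURN str::numeric;
EXCEPTION WHEN OTHERS THEN
    RETURN fallback;
END $$;

SELECT try_cast_numeric('123.45');   -- 123.45
SELECT try_cast_numeric('abc', 0);   -- 0
```

The per-type duplication this requires is exactly why a change at the TypeCast / input-function level would be more attractive.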
Re: Foreign key joins revisited
> > > First, there is only one FK in permission pointing to role, and we write a > query leaving out the key columns. > Then, another different FK in permission pointing to role is later added, > and our old query is suddenly in trouble. > > We already have that problem with cases where two tables have a common x column: SELECT x FROM a, b so this would be on-brand for the standards body. And worst case scenario you're just back to the situation you have now.
Re: Foreign key joins revisited
> Perhaps this would be more SQL idiomatic:
>
> FROM permission p
>    LEFT JOIN ON KEY role IN p AS r
>    LEFT JOIN team_role AS tr ON KEY role TO r
>    LEFT JOIN ON KEY team IN tr AS t
>    LEFT JOIN user_role AS ur ON KEY role TO r
>    LEFT JOIN ON KEY user IN ur AS u

My second guess would be:

FROM permission p
    LEFT JOIN role AS r ON [FOREIGN] KEY [(p.col1 [, p.col2 ...])]

where the key spec is only required when there are multiple foreign keys in permission pointing to role. But my first guess would be that the standards group won't get around to it.
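For comparison, assuming a hypothetical permission.role_id column with a foreign key to role(id), the guessed shorthand would expand to what we already write today:

```sql
-- What an "ON KEY"-style shorthand would save; the join condition is
-- recoverable from the FK definition. Column names here are assumptions.
SELECT p.*, r.name
FROM permission p
LEFT JOIN role r ON r.id = p.role_id;
```

The shorthand's value is that the join condition is derived from the catalog instead of being restated (and possibly misstated) in every query.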
Re: simplifying foreign key/RI checks
> > > > Good catch, thanks. Patch updated. > > > Applies clean. Passes check-world.
Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET
> > Out of curiosity, could you please tell me the concrete situations > where you wanted to delete one of two identical records? > In my case, there is a table with known duplicates, and we would like to delete all but the one with the lowest ctid, and then add a unique index to the table, which then allows us to use INSERT ON CONFLICT in a meaningful way. The other need for a DELETE...LIMIT or UPDATE...LIMIT is when you're worried about flooding a replica, so you parcel out the DML into chunks that don't cause unacceptable lag on the replica. Both of these can be accomplished via DELETE FROM foo WHERE ctid IN ( SELECT ... FROM foo ... LIMIT 1000), but until recently such a construct would result in a full table scan, and you'd take the same hit with each subsequent DML. I *believe* that the ctid range scan can now limit those scans, especially if you can order the limited set by ctid, but those techniques are not widely known at this time.
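Concretely, the two use cases above can be sketched like this today (table and column names are hypothetical):

```sql
-- De-duplicate: keep only the lowest-ctid row of each duplicate group,
-- then add the unique index that enables INSERT ... ON CONFLICT.
DELETE FROM foo a
USING foo b
WHERE a.col1 = b.col1
  AND a.col2 = b.col2
  AND a.ctid > b.ctid;

CREATE UNIQUE INDEX ON foo (col1, col2);

-- Batched delete to limit replica lag: run repeatedly until it reports
-- zero rows deleted.
DELETE FROM foo
WHERE ctid IN (SELECT ctid FROM foo WHERE some_condition LIMIT 1000);
```

A native DELETE ... LIMIT would express the second pattern directly instead of going through the ctid subquery.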
Re: simplifying foreign key/RI checks
> > > > I wasn't able to make much inroads into how we might be able to get > rid of the DETACH-related partition descriptor hacks, the item (3), > though I made some progress on items (1) and (2). > > For (1), the attached 0001 patch adds a new isolation suite > fk-snapshot.spec to exercise snapshot behaviors in the cases where we > no longer go through SPI. It helped find some problems with the > snapshot handling in the earlier versions of the patch, mainly with > partitioned PK tables. It also contains a test along the lines of the > example you showed upthread, which shows that the partition descriptor > hack requiring ActiveSnapshot to be set results in wrong results. > Patch includes the buggy output for that test case and marked as such > in a comment above the test. > > In updated 0002, I fixed things such that the snapshot-setting > required by the partition descriptor hack is independent of > snapshot-setting of the RI query such that it no longer causes the PK > index scan to return rows that the RI query mustn't see. That fixes > the visibility bug illustrated in your example, and as mentioned, also > exercised in the new test suite. > > I also moved find_leaf_pk_rel() into execPartition.c with a new name > and a new set of parameters. > > -- > Amit Langote > EDB: http://www.enterprisedb.com Sorry for the delay. This patch no longer applies, it has some conflict with d6f96ed94e73052f99a2e545ed17a8b2fdc1fb8a
Re: Getting rid of regression test input/ and output/ files
On Sun, Dec 19, 2021 at 7:00 PM Tom Lane wrote: > Corey Huinker writes: > > Which brings up a tangential question, is there value in having something > > that brings in one or more env vars as psql vars directly. I'm thinking > > something like: > > > \importenv pattern [prefix] > > Meh ... considering we've gone this long without any getenv capability > in psql at all, that seems pretty premature. > > regards, tom lane > Fair enough. Patches didn't apply with `git apply` but did fine with `patch -p1`, from there it passes make check-world.
Re: Getting rid of regression test input/ and output/ files
On Sun, Dec 19, 2021 at 5:48 PM Tom Lane wrote: > Corey Huinker writes: > > I have a nitpick about the \getenv FOO FOO lines. > > It's a new function to everyone, and to anyone who hasn't seen the > > documentation it won't be immediately obvious which one is the ENV var > and > > which one is the local var. Lowercasing the local var would be a way to > > reinforce which is which to the reader. It would also be consistent with > > var naming in the rest of the script. > > Reasonable idea. Another thing I was wondering about was whether > to attach PG_ prefixes to the environment variable names, since > those are in a more-or-less global namespace. If we do that, > then a different method for distinguishing the psql variables > is to not prefix them.

+1 to that as well.

Which brings up a tangential question: is there value in having something that brings in one or more env vars as psql vars directly? I'm thinking something like:

\importenv pattern [prefix]

(alternate names: \getenv_multi, \getenv_pattern, \getenvs, etc.), which could be used like

\importenv PG* env_

which would import PGFOO and PGBAR as env_PGFOO and env_PGBAR, awkward names but leaving no doubt about where a previously unreferenced variable came from. I don't *think* we need it for this specific case, but since the subject of env vars has come up I thought I'd throw it out there.
Re: Getting rid of regression test input/ and output/ files
> > > 0001 adds the \getenv command to psql; now with documentation > and a simple regression test. > +1. Wish I had added this years ago when I had a need for it. > > 0002 tweaks pg_regress to export the needed values as environment > variables, and modifies the test scripts to use those variables. > (For ease of review, this patch modifies the scripts in-place, > and then 0003 will move them.) A few comments on this: > > * I didn't see any value in exporting @testtablespace@ as a separate > variable; we might as well let the test script know how to construct > that path name. > > * I concluded that the right way to handle the concatenation issue > is *not* to rely on SQL literal concatenation, but to use psql's > \set command to concatenate parts of a string. In particular this > +1 to that, much better than the multi-line thing. I have a nitpick about the \getenv FOO FOO lines. It's a new function to everyone, and to anyone who hasn't seen the documentation it won't be immediately obvious which one is the ENV var and which one is the local var. Lowercasing the local var would be a way to reinforce which is which to the reader. It would also be consistent with var naming in the rest of the script. > > 0004 finally removes the no-longer-needed infrastructure in > +1 Deleted code is debugged code.
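For reference, the \getenv-plus-\set pattern being endorsed above looks roughly like this in a converted test script (variable and path names are illustrative, not taken from the actual patch):

```sql
-- Pull the value exported by pg_regress into a psql variable, then use
-- \set's argument concatenation to build the full path; :'filename'
-- interpolates it as a properly quoted SQL literal.
\getenv abs_srcdir PG_ABS_SRCDIR
\set filename :abs_srcdir '/data/copy.data'
COPY testtable FROM :'filename';
```

This keeps all substitution inside psql, which is what lets the input/ and output/ preprocessing machinery go away.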
Re: Add id's to various elements in protocol.sgml
On Sun, Dec 5, 2021 at 11:15 AM Daniel Gustafsson wrote: > > On 5 Dec 2021, at 16:51, Brar Piening wrote: > > > The attached patch adds id's to various elements in protocol.sgml to > > make them more accesssible via the public html documentation interface. > > Off the cuff without having checked the compiled results yet, it seems > like a good idea. > > — > Daniel Gustafsson > I wanted to do something similar for every function specification in the docs. This may inspire me to take another shot at that.
Re: automatically generating node support functions
> > build support and made the Perl code more portable, so that the cfbot > doesn't have to be sad. > Was this also the reason for doing the output with print statements rather than using one of the templating libraries? I'm mostly just curious, and certainly don't want it to get in the way of working code.
Re: WIP: System Versioned Temporal Table
On Sun, Sep 19, 2021 at 3:12 PM Hannu Krosing wrote: > A side table has the nice additional benefit that we can very easily > version the *table structure* so when we ALTER TABLE and the table > structure changes we just make a new side table with now-current > structure. > It's true that would allow for perfect capture of changes to the table structure, but how would you query the thing? If a system versioned table was created with a column foo that is type float, and then we dropped that column, how would we ever query what the value of foo was in the past? Would the columns returned from SELECT * change based on the timeframe requested? If we then later added another column that happened to also be named foo but now was type JSONB, would we change the datatype returned based on the time period being queried? Is the change in structure itself a system-versioned change which must be captured? > Also we may want different set of indexes on historic table(s) for > whatever reason > +1 > > And we may even want to partition history tables for speed, storage > cost or just to drop very ancient history > +1
Re: WIP: System Versioned Temporal Table
> > Thanks for giving this a lot of thought. When you asked the question > the first time you hadn't discussed how that might work, but now we > have something to discuss. > My ultimate goal is to unify this effort with the application period effort. Step 1 in that was to understand what each was doing and why they were doing it. If you check out the other thread, you'll see a highly similar message that I sent over there. > There are 3 implementation routes that I see, so let me explain so > that others can join the discussion. > > 1. Putting all data in one table. This makes DROP SYSTEM VERSIONING > effectively impossible. It requires access to the table to be > rewritten to add in historical quals for non-historical access and it > requires some push-ups around indexes. (The current patch adds the > historic quals by kludging the parser, which is wrong place, since it > doesn't work for joins etc.. However, given that issue, the rest seems > to follow on naturally). > > 2. Putting data in a side table. This makes DROP SYSTEM VERSIONING > fairly trivial, but it complicates many DDL commands (please make a > list?) and requires the optimizer to know about this and cater to it, > possibly complicating plans. Neither issue is insurmountable, but it > becomes more intrusive. > > The current patch could go in either of the first 2 directions with > further work. > > 3. Let the Table Access Method handle it. I call this out separately > since it avoids making changes to the rest of Postgres, which might be > a good thing, with the right TAM implementation. > I'd like to hear more about this idea number 3. I could see value in allowing the history table to be a foreign table, perhaps writing to csv/parquet/whatever files, and that sort of setup could be persuasive to a regulator who wants extra-double-secret-proof that auditing cannot be tampered with. 
But with that we'd have to give up the relkind idea, which itself was going to be a cheap way to prevent updates outside of the system triggers.
Re: Undocumented AT TIME ZONE INTERVAL syntax
> > >> Yeah, I really didn't expect to change the behavior, but wanted to make > sure that the existing behavior was understood. I'll whip up a patch.

Attached is an attempt at an explanation of the edge cases I was encountering, as well as some examples. If nothing else, the examples will draw eyes and searches to the explanations that were already there.

From 618a7fbd5606510b993697a0a1968fde5f02fbb2 Mon Sep 17 00:00:00 2001
From: coreyhuinker
Date: Sun, 19 Sep 2021 22:34:42 -0400
Subject: [PATCH 1/2] Explain some of the nuances with INTERVAL data when the
 string literal offers fewer positions of information than the fields
 specification requires.

---
 doc/src/sgml/datatype.sgml | 25 +++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 50a2c8e5f1..83ebb68333 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2816,7 +2816,30 @@ P years-months-
     defined with a fields specification, the interpretation of unmarked
     quantities depends on the fields. For example INTERVAL '1' YEAR is
     read as 1 year, whereas
-    INTERVAL '1' means 1 second. Also, field values
+    INTERVAL '1' means 1 second. If the string provided contains
+    a : then the number to the left of the first :
+    will be used to fill in the most significant sub-day part of the
+    fields specification.
+
+SELECT INTERVAL '2' HOUR TO SECOND;
+ interval
+----------
+ 00:00:02
+
+SELECT INTERVAL '4:2' HOUR TO SECOND;
+ interval
+----------
+ 04:02:00
+
+SELECT INTERVAL '2:' DAY TO SECOND;
+ interval
+----------
+ 02:00:00
+
+    Also, field values
     to the right of the least significant field allowed by the fields
     specification are silently discarded. For example, writing
     INTERVAL '1 day 2:03:04' HOUR TO MINUTE
--
2.30.2

From 042467c84a4f9d3cbbdb7660b0b174d99dadde9d Mon Sep 17 00:00:00 2001
From: coreyhuinker
Date: Sun, 19 Sep 2021 23:26:28 -0400
Subject: [PATCH 2/2] Add example to show the effect of AT TIME ZONE INTERVAL.
---
 doc/src/sgml/func.sgml | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 78812b2dbe..eee10f2db9 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10435,6 +10435,9 @@ SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT TIME ZONE 'America/D
 SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'Asia/Tokyo' AT TIME ZONE 'America/Chicago';
 Result: 2001-02-16 05:38:40
+
+SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'UTC' AT TIME ZONE INTERVAL '3:21:20' HOUR TO SECOND;
+Result: 2001-02-17 00:00:00+00

 The first example adds a time zone to a value that lacks it, and
 displays the value using the current TimeZone
@@ -10442,7 +10445,8 @@ SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'Asia/Tokyo' AT TIME ZONE 'A
 to the specified time zone, and returns the value without a time zone.
 This allows storage and display of values different from the current
 TimeZone setting. The third example converts
-Tokyo time to Chicago time.
+Tokyo time to Chicago time. The fourth example adds the time zone UTC
+to a value that lacks it, and then applies a highly contrived fixed offset.
--
2.30.2
Re: Undocumented AT TIME ZONE INTERVAL syntax
On Sun, Sep 19, 2021 at 10:56 AM Tom Lane wrote:
> Corey Huinker writes:
> >> SELECT '2018-03-04' AT TIME ZONE INTERVAL '2' HOUR TO MINUTE;
>
> > ... But none of this is in our own documentation.
>
> That's not entirely true. [1] says
>
> When writing an interval constant with a fields specification, or when
> assigning a string to an interval column that was defined with a
> fields specification, the interpretation of unmarked quantities
> depends on the fields. For example INTERVAL '1' YEAR is read as 1
> year, whereas INTERVAL '1' means 1 second. Also, field values “to the
> right” of the least significant field allowed by the fields
> specification are silently discarded. For example, writing INTERVAL '1
> day 2:03:04' HOUR TO MINUTE results in dropping the seconds field, but
> not the day field.

That text addresses the case of the unadorned string (seconds) and the overflow case (more string values than places to put them), but doesn't really address the underflow.

> But I'd certainly agree that a couple of examples are not a specification.
> Looking at DecodeInterval, it looks like the rule is that unmarked or
> ambiguous fields are matched to the lowest field mentioned by the typmod
> restriction. Thus
>
> regression=# SELECT INTERVAL '4:2' HOUR TO MINUTE;
>  interval
> ----------
>  04:02:00
> (1 row)
>
> regression=# SELECT INTERVAL '4:2' MINUTE TO SECOND;
>  interval
> ----------
>  00:04:02
> (1 row)

# SELECT INTERVAL '04:02' HOUR TO SECOND;
 interval
----------
 04:02:00

This result was a bit unexpected, and the existing documentation doesn't address underflow cases like this. So, restating all this to get ready to document it, the rule seems to be:

1. Integer strings with no spaces or colons will always apply to the rightmost end of the restriction given; lack of a restriction means seconds. Example:

# SELECT INTERVAL '2' HOUR TO SECOND, INTERVAL '2' HOUR TO MINUTE, INTERVAL '2';
 interval | interval | interval
----------+----------+----------
 00:00:02 | 00:02:00 | 00:00:02
(1 row)

2.
Strings with time context (space separator for days, : for everything else) will apply starting with the leftmost part of the spec that fits, continuing to the right until string values are exhausted. Examples:

# SELECT INTERVAL '4:2' HOUR TO SECOND, INTERVAL '4:2' DAY TO SECOND;
 interval | interval
----------+----------
 04:02:00 | 04:02:00
(1 row)

> If you wanted to improve this para it'd be cool with me.

I think people's eyes are naturally drawn to the example tables, and because the rules for handling string underflow are subtle, I think a few concrete examples are the way to go.

> > Before I write a patch to add this to the documentation, I'm curious what
> > level of sloppiness we should tolerate in the interval calculation. Should
> > we enforce the time string to actually conform to the format laid out in
> > the X TO Y spec?
>
> We have never thrown away high-order fields:

And with the above I'm now clear that we're fine with the existing behavior for underflow.

> I'm not sure what the SQL spec says here, but I'd be real hesitant to
> change the behavior of cases that we've accepted for twenty-plus
> years, unless they're just obviously insane. Which these aren't IMO.

Yeah, I really didn't expect to change the behavior, but wanted to make sure that the existing behavior was understood. I'll whip up a patch.
Undocumented AT TIME ZONE INTERVAL syntax
In reviewing Paul's application period patch, I noticed some very curious syntax in the test cases. I learned that Paul is equally confused by it, and has asked about it in his PgCon 2020 presentation.

> SELECT '2018-03-04' AT TIME ZONE INTERVAL '2' HOUR TO MINUTE;
      timezone
---------------------
 2018-03-04 05:02:00
(1 row)

Searching around, I found several instances of this syntax being used [1][2][3], but with one important clarifying difference: the expected syntax was

> SELECT '2018-03-04' AT TIME ZONE INTERVAL '2:00' HOUR TO MINUTE;
      timezone
---------------------
 2018-03-04 07:00:00
(1 row)

Now I understand that the user probably meant to do this:

# SELECT '2018-03-04' AT TIME ZONE INTERVAL '2' HOUR;
      timezone
---------------------
 2018-03-04 07:00:00
(1 row)

But none of this is in our own documentation. Before I write a patch to add this to the documentation, I'm curious what level of sloppiness we should tolerate in the interval calculation. Should we enforce the time string to actually conform to the format laid out in the X TO Y spec? If we don't require that, is it correct to say that the values will be filled from order of least significance to greatest?

[1] https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/DataTypes/Date-Time/TIMESTAMPATTIMEZONE.htm
[2] https://docs.teradata.com/r/kmuOwjp1zEYg98JsB8fu_A/aWY6mGNJ5CYJlSDrvgDQag
[3] https://community.snowflake.com/s/question/0D50Z9AqIaSSAV/is-it-possible-to-add-an-interval-of-5-hours-to-the-session-timezone-
Re: SQL:2011 application time
In IBM DB2 you can only have one because application-time periods must > be named "business_time" (not joking). > I saw that as well, and it made me think that someone at IBM is a fan of Flight Of The Conchords. > Personally I feel like it's a weird limitation and I wouldn't mind > supporting more, but my current implementation only allows for one, > and I'd have to rethink some things to do it differently. > I'm satisfied that it's not something we need to do in the first MVP. > > Yes. Even though the name "SYSTEM_TIME" is technically enough, I'd > still include a pertype column to make distinguishing system vs > application periods easier and more obvious. > SYSTEM_TIME seems to allow for DATE values in the start_time and end_time fields, though I cannot imagine how that would ever be practical, unless it were somehow desirable to reject subsequent updates within a 24 hour timeframe. I have seen instances where home-rolled application periods used date values, which had similar problems where certain intermediate updates would simply have to be discarded in favor of the one that was still standing at midnight. > > > 2. The system versioning effort has chosen 'infinity' as their end-time > value, whereas you have chosen NULL, as that makes sense for an unbounded > range. Other databases seem to leverage '9999-12-31 23:59:59' (SQLServer, > IIRC) whereas some others seem to use '2999-12-31 23:59:59' but those > might have been home-rolled temporal implementations. To further add to the > confusion, the syntax seems to specify the keyword of MAXVALUE, which > further muddies things. The system versioning people went with 'infinity' > seemingly because it doesn't prescribe an end to the world like SQLServer did, but > also because it allowed for a primary key based on (id, endtime) and that's > just not possible with NULL endtime values. > > I think it's a little weird that our system-time patch mutates your > primary key. None of the other RDBMSes do that.
I don't think it's > incompatible (as long as the system time patch knows how to preserve > the extra period/range data in an application-time temporal key), but > it feels messy to me. > Per outline below, I'm proposing an alternate SYSTEM_TIME implementation that would leave the PK as-is. > I would prefer if system-time and application-time used the same value > to mean "unbounded". Using null means we can support any type (not > just types with +-Infinity). And it pairs nicely with range types. If > the only reason for system-time to use Infinity is the primary key, I > think it would be better not to mutate the primary key (and store the > historical records in a separate table as other RDBMSes do). >

The two "big wins" of infinity seemed (to me) to be:

1. the ability to add "AND end_time = 'infinity'" as a cheap way to get current rows
2. clauses like "WHERE CURRENT_DATE - 3 BETWEEN start_time AND end_time" would work.

Granted, there's very specific new syntax to do that properly, but you know somebody's gonna see the columns and try to do it that way. > > Btw Oracle also uses NULL to mean "unbounded". > Huh, I missed that one. That is good in that it gives some precedent for how you've approached it. > > We presently forbid PKs from including expressions, but my patch lifts > that exception so it can index a rangetype expression built from the > period start & end columns. So even if we must include the system-time > end column in a PK, perhaps it can use a COALESCE expression to store > Infinity even while using NULL to signify "currently true" from a user > perspective. > Either way seems viable, but I understand why you want to leverage ranges in this way. > > > 3. I noticed some inconsistency in the results from various "SELECT * > FROM portion_of_test" examples. In some, the "valid_at" range is shown but > not columns that make it up, and in some others, the "valid_from" and > "valid_to" columns are shown, with no mention of the period.
From what I've > seen, the period column should be invisible unless invoked, like ctid or > xmin. > > In most cases the tests test the same functionality with both PERIODs > and rangetype columns. For FKs they test all four combinations of > PERIOD-referencing-PERIOD, PERIOD-referencing-range, > range-referencing-PERIOD, and range-referencing-range. If valid_at is > a genuine column, it is included in SELECT *, but not if it is a > PERIOD. > Ok, I'll have to look back over the test coverage to make sure that I understand the behavior now. > > > 4. The syntax '2018-03-04' AT TIME ZONE INTERVAL '2' HOUR TO MINUTE > simply confounded me. > > Me too! I have no idea what that is supposed to mean. But that > behavior predates my patch. I only had to deal with it because it > creates a shift-reduce conflict with `FOR PORTION OF valid_at FROM x > TO y`, where x & y are expressions. I asked about this syntax at my > PgCon 2020 talk, but I haven't ever received an answer. Perhaps > someone else knows
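On the COALESCE idea for the system-time end column: an expression-based unique index can already express that today, even though primary keys cannot. A sketch, with table and column names assumed:

```sql
-- NULL in end_time means "currently true" from the user's perspective,
-- while the index key still gets a non-null, comparable value. Table
-- and column names here are hypothetical.
CREATE TABLE foo (
    id         int NOT NULL,
    start_time timestamptz NOT NULL,
    end_time   timestamptz  -- NULL = unbounded / current
);

CREATE UNIQUE INDEX foo_sv_key
    ON foo (id, COALESCE(end_time, 'infinity'::timestamptz));
```

This gets the uniqueness guarantee of the (id, endtime) key without forcing 'infinity' into the user-visible data.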
Re: WIP: System Versioned Temporal Table
> > 1. Much of what I have read about temporal tables seemed to imply or almost assume that system temporal tables would be implemented as two actual separate tables. Indeed, SQLServer appears to do it that way [1] with syntax like
> >
> > WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.WebsiteUserInfoHistory));
> >
> > Q 1.1. Was that implementation considered and if so, what made this implementation more appealing?

I've been digging some more on this point, and I've reached the conclusion that a separate history table is the better implementation. It would make the act of removing system versioning into little more than a DROP TABLE, plus adjusting the base table to reflect that it is no longer system versioned.

What do you think of this method:

1. The regular table remains unchanged, but a pg_class attribute named "relissystemversioned" would be set to true.
2. I'm unsure whether the standard allows dropping a column from a table while it is system versioned, but the purpose behind system versioning makes me believe the answer is a strong "no", and requiring DROP COLUMN to fail on relissystemversioned = 't' seems pretty straightforward.
3. The history table would be given a default name of $FOO_history (space permitting), but could be overridden with the history_table option.
4. The history table would have relkind = 'h'.
5. The history table will only have rows that are not current, so it is created empty.
6. As such, the table is effectively append-only, in a way that vacuum can actually leverage, and likewise the fill factor of such a table should never be less than 100.
7. The history table could be updated only via system-defined triggers (insert, update, delete, alter to add columns), or row migration similar to that found in partitioning. It seems like this would work as the two tables acting as partitions of the same table, but presently we can't have multi-parent partitions.
8. The history table would be indexed the same as the base table, except that all unique indexes would be made non-unique, and an index on pk + start_time + end_time would be added.
9. The primary key of the base table would remain the existing pk values, and would basically function normally, with triggers to carry changes forth to the history table. The net effect of this is that the end_time value of all rows in the main table would always be the chosen "current" value (infinity, null, 9999-12-31, etc.) and as such might not actually _need_ to be stored.
10. Queries that omit the FOR SYSTEM_TIME clause, as well as ones that use FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP, would simply use the base table directly with no quals to add.
11. Queries that use FOR SYSTEM_TIME with anything other than AS OF CURRENT_TIMESTAMP would do a union of the base table and the history table, with quals applied to both.
12. It's a fair question whether the history table would be something that could be queried directly. I'm inclined to say no, because that allows for things like SELECT FOR UPDATE, which of course we'd have to reject.
13. If a history table is directly referenceable, then SELECT permission can be granted or revoked as normal, but all insert/update/delete/truncate operations would raise an error.
14. DROP SYSTEM VERSIONING from a table would be quite straightforward: the history table would be dropped along with the triggers that reference it, setting relissystemversioned = 'f' on the base table.

I think this would have some key advantages:

1. MVCC bloat is no worse than it was before.
2. No changes whatsoever to referential integrity.
3. DROP SYSTEM VERSIONING becomes an O(1) operation.

Thoughts? I'm going to be making a similar proposal to the people doing the application time effort, but I'm very much hoping that we can reach some consensus and combine efforts.
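To make the row-migration idea concrete, here is a minimal sketch of points 1-8 using ordinary user-level triggers (all object names are hypothetical; the actual proposal would use relkind = 'h' and system-defined triggers rather than anything a user could create):

```sql
-- Base table stays as-is; history rows live in a separate table.
CREATE TABLE emp (
    eno       integer PRIMARY KEY,
    ename     text,
    sys_start timestamptz NOT NULL DEFAULT now()
);

-- History table: only non-current rows, so it starts empty (points 3-5).
CREATE TABLE emp_history (
    LIKE emp,
    sys_end timestamptz NOT NULL
);
CREATE INDEX ON emp_history (eno, sys_start, sys_end);   -- point 8

-- Row migration on update/delete (point 7), sketched as a trigger.
CREATE FUNCTION emp_versioning() RETURNS trigger AS $$
BEGIN
    INSERT INTO emp_history SELECT (OLD).*, now();
    IF TG_OP = 'UPDATE' THEN
        NEW.sys_start := now();
        RETURN NEW;
    END IF;
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER emp_versioning_trg
    BEFORE UPDATE OR DELETE ON emp
    FOR EACH ROW EXECUTE FUNCTION emp_versioning();
```

A FOR SYSTEM_TIME AS OF query (point 11) would then be planned roughly as a UNION ALL of emp and emp_history, with sys_start/sys_end quals applied to both branches.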
Re: WIP: System Versioned Temporal Table
On Sun, Sep 12, 2021 at 12:02 PM Simon Riggs wrote:
> On Fri, 10 Sept 2021 at 19:30, Jaime Casanova wrote:
> >
> > On Tue, Aug 10, 2021 at 01:20:14PM +0100, Simon Riggs wrote:
> > > On Wed, 14 Jul 2021 at 12:48, vignesh C wrote:
> > > >
> > > > The patch does not apply on Head anymore, could you rebase and post a patch. I'm changing the status to "Waiting for Author".
> > >
> > > OK, so I've rebased the patch against current master to take it to v15.
> > >
> > > I've then worked on the patch some myself to make v16 (attached), adding these things:
> >
> > Hi Simon,
> >
> > This one doesn't apply nor compile anymore. Can we expect a rebase soon?
>
> Hi Jaime,
>
> Sorry for not replying.
>
> Yes, I will rebase again to assist the design input I have requested. Please expect that on Sep 15.
>
> Cheers
>
> --
> Simon Riggs
> http://www.EnterpriseDB.com/

I've been interested in this patch, especially with how it will interoperate with the work on application periods in https://www.postgresql.org/message-id/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mdhcy4_qq0+noc...@mail.gmail.com . I've written up a few observations and questions in that thread, and wanted to do the same here, as the questions are a bit narrower but no less interesting.

1. Much of what I have read about temporal tables seemed to imply or almost assume that system temporal tables would be implemented as two actual separate tables. Indeed, SQLServer appears to do it that way [1] with syntax like

WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.WebsiteUserInfoHistory));

Q 1.1. Was that implementation considered and if so, what made this implementation more appealing?

2. The endtime column constraint which enforces GENERATED ALWAYS AS ROW END seems like it would have appeal outside of system versioning, as a lot of tables have a last_updated column, and it would be nice if it could handle itself and not rely on fallible application programmers or require trigger overhead.

Q 2.1.
Is that something we could break out into its own patch?

3. It is possible to have bi-temporal tables (having both a system_time period and a named application period) as described in [2]; the specific example was

CREATE TABLE Emp(
    ENo INTEGER,
    EStart DATE,
    EEnd DATE,
    EDept INTEGER,
    PERIOD FOR EPeriod (EStart, EEnd),
    Sys_start TIMESTAMP(12) GENERATED ALWAYS AS ROW START,
    Sys_end TIMESTAMP(12) GENERATED ALWAYS AS ROW END,
    EName VARCHAR(30),
    PERIOD FOR SYSTEM_TIME (Sys_start, Sys_end),
    PRIMARY KEY (ENo, EPeriod WITHOUT OVERLAPS),
    FOREIGN KEY (Edept, PERIOD EPeriod)
        REFERENCES Dept (DNo, PERIOD DPeriod)
) WITH SYSTEM VERSIONING

What's interesting here is that in the case of a bitemporal table, it was the application period that got the defined primary key. The paper went on to say that only the _current_ rows of the table needed to be unique, as it wasn't possible to create rows with past system temporal values. This sounds like a partial index to me, and luckily postgres can do referential integrity on any unique index, not just primary keys. In light of the assumption of a history side-table, I guess I shouldn't be surprised.

Q 3.1. Do you think that it would be possible to implement system versioning with just a unique index?

Q 3.2. Are there any barriers to using a partial index as the hitch for a foreign key? Would it be any different from the implied "and endtime = 'infinity'" that's already being done?

4. The choice of 'infinity' seemed like a good one initially - it's not null, so it can be used in a primary key, and it's not some hackish magic date like SQLServer's '9999-12-31 23:59:59.999'. However, it may not jibe as well with application versioning, which is built very heavily upon range types (and multirange types), and those ranges are capable of saying that a record is valid for an unbounded amount of time in the future; that's represented with NULL, not infinity. It could be awkward to have the system endtime be infinity and the application period endtime be NULL.
Q 4.1. Do you have any thoughts about how to resolve this?

5. System versioning columns were indicated with additional columns in pg_attribute.

Q 5.1. If you were to implement application versioning yourself, would you just add additional columns to pg_attribute for those?

6. The current effort to implement application versioning creates an INFORMATION_SCHEMA view called PERIODS. I wasn't aware of this one before, but there seems to be precedent for it existing.

Q 6.1. Would system versioning belong in such a view?

7. This is a trifle, but the documentation is inconsistent about starttime vs StartTime and endtime vs EndTime.

8. Overall, I'm really excited about both of these efforts, and I'm looking for ways to combine the efforts, perhaps starting with a patch that implements the SQL syntax but raises not-implemented errors, and each effort could then build off of that.

[1]
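Regarding Q 3.1/Q 3.2, the partial-index version of "only current rows need to be unique" might look like this (names hypothetical):

```sql
-- Hypothetical: uniqueness enforced only on current rows.
CREATE TABLE emp (
    eno       integer NOT NULL,
    sys_start timestamptz NOT NULL,
    sys_end   timestamptz NOT NULL DEFAULT 'infinity'
);

-- At most one current row per eno; historical duplicates are fine.
CREATE UNIQUE INDEX emp_current_pk
    ON emp (eno) WHERE sys_end = 'infinity';
```

As far as I know, a foreign key constraint cannot currently reference a partial unique index, so the RI check queries would have to carry the implied "and sys_end = 'infinity'" qual themselves - which may be exactly the barrier Q 3.2 is asking about.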
Re: SQL:2011 application time
So I've been eagerly watching this thread and hoping to have time to devote to it. I've also been looking at the thread at https://www.postgresql.org/message-id/calay4q8pp699qv-pjzc4tos-e2nzrjkrvax-xqg1aqj2q+w...@mail.gmail.com that covers system versioning, and per our conversation far too long ago (again, my bad) it's obvious that the two efforts shouldn't do anything that would be in conflict with one another, as we eventually have to support bitemporal [1] tables: tables that have both system versioning and an application period.

Below is a list of observations and questions about this proposed patch in isolation, but mostly about how it relates to the work being done for system versioning.

1. This patch creates a pg_period catalog table, whereas the system versioning work relies on additions to pg_attribute to identify the start/end columns. Initially I thought this was because it was somehow possible to have *multiple* application periods defined on a table, but in reading [1] I see that there are some design suppositions that would make a second application period impossible [2]. I can also see where having this table would facilitate the easy creation of INFORMATION_SCHEMA.PERIODS. I was previously unaware that this info schema table was a thing, but I have found references to it, though I'm unclear as to whether it's supposed to have information about system versioned tables in it as well.

Q 1.1. Would a bitemporal table have two entries in that view?

Q 1.2. Could you see being able to implement this without pg_period, using only additions to pg_attribute (start/end for system temporal, start/end for application, plus an addition for period name)?

Q 1.3. Can you see a way to represent the system versioning in pg_period such that bitemporal tables were possible?

2. The system versioning effort has chosen 'infinity' as their end-time value, whereas you have chosen NULL, as that makes sense for an unbounded range.
Other databases seem to leverage '9999-12-31 23:59:59' (SQLServer, IIRC) whereas some others seem to use '2999-12-31 23:59:59', but those might have been home-rolled temporal implementations. To further add to the confusion, the syntax seems to specify the keyword MAXVALUE, which further muddies things. The system versioning people went with 'infinity', seemingly because it doesn't prescribe an end to the world like SQLServer did, but also because it allowed for a primary key based on (id, endtime), and that's just not possible with NULL endtime values.

Q 2.1. Do you have any thoughts about how to resolve this notational logjam?

3. I noticed some inconsistency in the results from various "SELECT * FROM portion_of_test" examples. In some, the "valid_at" range is shown but not the columns that make it up, and in some others, the "valid_from" and "valid_to" columns are shown, with no mention of the period. From what I've seen, the period column should be invisible unless invoked, like ctid or xmin.

4. The syntax '2018-03-04' AT TIME ZONE INTERVAL '2' HOUR TO MINUTE simply confounded me. I googled around for it, but could find no matches for postgres except in mailing list discussions circa 2003. I tried it out myself and, lo and behold:

# SELECT '2018-03-04' AT TIME ZONE INTERVAL '2' HOUR TO MINUTE;
      timezone
---------------------
 2018-03-04 05:02:00
(1 row)

I really didn't expect that to work, or even "work". I can see that it added 2 minutes to UTC's perspective on my local concept of midnight, but I don't understand what it's supposed to mean.

Q 4.1. What does it mean?

5. I haven't seen any actual syntax conflicts between this patch and the system versioning patch. Both teams added basically the same keywords, though I haven't dove more deeply into any bison incompatibilities. Still, it's a great start.

6. Overall, I'm really excited about what this will mean for data governance in postgres.
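To illustrate the mismatch in point 2, range types spell "unbounded" as NULL, and that behaves differently from an explicit 'infinity' bound:

```sql
-- A NULL upper bound means "unbounded", and upper_inf() reports it:
SELECT upper_inf(tstzrange(now(), NULL));        -- true
-- An explicit 'infinity' bound is an ordinary (if special) value:
SELECT upper_inf(tstzrange(now(), 'infinity'));  -- false
-- Both ranges nevertheless contain any future timestamp:
SELECT tstzrange(now(), NULL) @> (now() + interval '100 years');  -- true
```

So a system-time side that stores 'infinity' and an application-time side that stores NULL would give different answers to upper_inf() and to equality comparisons on otherwise-identical ranges.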
[1] https://cs.ulb.ac.be/public/_media/teaching/infoh415/tempfeaturessql2011.pdf
[2] In the bitemporal table example in [1], the application period gets the defined primary key, and the system_time period would be merely unique.

On Mon, Sep 13, 2021 at 12:12 AM Paul A Jungwirth < p...@illuminatedcomputing.com> wrote:
> On Fri, Sep 10, 2021 at 6:50 PM Jaime Casanova wrote:
> >
> > patch 01: does apply but gives a compile warning (which is fixed by patch 02)
> [snip]
> > patch 03: produces these compile errors.
>
> I did a rebase and fixed this new error, as well as the warnings.
>
> On Mon, Sep 6, 2021 at 1:40 PM Zhihong Yu wrote:
> >
> > + * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
> >
> > It seems the year (2018) should be updated to 2021.
>
> Done.
>
> > For RemovePeriodById(), it seems table_open() can be called after SearchSysCache1(). This way, if HeapTupleIsValid(tup) is true, table_open() can be skipped.
>
> This seems like it permits a race condition when two connections both try to drop the period, right?
>
> > For tablecmds.c,
Re: simplifying foreign key/RI checks
> > Rebased patches attached.

I'm reviewing the changes since v6, which was my last review.

Making ExecLockTableTuple() its own function makes sense. Snapshots are now accounted for. The changes that account for n-level partitioning make sense as well.

Passes make check-world. Not user facing, so no user documentation required.

Marking as ready for committer again.
Nitpick/question: Use of aliases for global variables in functions
I'm using an ongoing patch review to educate myself on parts of the codebase. In src/backend/access/transam/xact.c, I'm noticing a code style inconsistency.

In some cases, a function will declare a local variable of some struct pointer type, initialize it from a globally declared struct pointer, and then use it like this:

bool
IsTransactionState(void)
{
    TransactionState s = CurrentTransactionState;

    /*
     * TRANS_DEFAULT and TRANS_ABORT are obviously unsafe states. However, we
     * also reject the startup/shutdown states TRANS_START, TRANS_COMMIT,
     * TRANS_PREPARE since it might be too soon or too late within those
     * transition states to do anything interesting. Hence, the only "valid"
     * state is TRANS_INPROGRESS.
     */
    return (s->state == TRANS_INPROGRESS);
}

And in other cases, the function will just reference the global directly:

bool
TransactionStartedDuringRecovery(void)
{
    return CurrentTransactionState->startedInRecovery;
}

Clearly, this isn't a big deal. I'm willing to bet that the compiler looks at the variable declaration and thinks "welp, it's just going to end up in an address register anyway, may as well optimize it away" at even really low optimization levels, so I'd be surprised if there's even a difference in the machine code generated, though it might hinder inlining the function as a whole. Mostly I'm just noticing the inconsistent coding style.

Good points about the alias: I can see where the alias variable acts as a reminder of the data type, but even that is the type of a pointer to the struct, so we're still one level of indirection away instead of two. I can also see where it might just be an artifact of getting stuff to fit within 80 chars.

Bad points about the alias: I can see where it might hinder function inlining, and might also hinder git grep searches for the global variable name (though the global in question is scoped to the file, so that's not a big concern).

Is this something worth standardizing, and if so, which style do we like better?