RE: speed up a logical replica setup
Dear Euler,

Again, thanks for updating the patch! Here are my random comments for v9.

01. I cannot find your replies to my comment #7 [1], but you reverted the related changes. I'm not sure whether you are still considering it or have decided not to include the changes. Can you clarify your opinion? (This is needed because the changes are huge, so it quite affects other development...)

02.
```
+ -t seconds
+ --timeout=seconds
```
But the source code requires `--recovery-timeout`. Please update one or the other.

03.
```
+ *Create a new logical replica from a standby server
```
Junwang suggested a change here, but the change was reverted [2]. Can you clarify your opinion on this as well?

04.
```
+/*
+ * Is the source server ready for logical replication? If so, create the
+ * publications and replication slots in preparation for logical replication.
+ */
+static bool
+setup_publisher(LogicalRepInfo *dbinfo)
```
But this function also verifies the source server. I feel the verification should be done in a separate function.

05.
```
+/*
+ * Is the target server ready for logical replication?
+ */
+static bool
+setup_subscriber(LogicalRepInfo *dbinfo)
```
Actually, this function does not set up the subscriber. It just verifies whether the target can become a subscriber, right? It should be renamed.

06.
```
+atexit(cleanup_objects_atexit);
```
The registration of the cleanup function is too early. This sometimes triggers a core dump. E.g.:
```
$ pg_subscriber --publisher-conninfo --subscriber-conninfo 'user=postgres port=5432' --verbose --database 'postgres' --pgdata data_N2/
pg_subscriber: error: too many command-line arguments (first is "user=postgres port=5432")
pg_subscriber: hint: Try "pg_subscriber --help" for more information.
Segmentation fault (core dumped)
$ gdb ...
(gdb) bt
#0  cleanup_objects_atexit () at pg_subscriber.c:131
#1  0x7fb982cffce9 in __run_exit_handlers () from /lib64/libc.so.6
#2  0x7fb982cffd37 in exit () from /lib64/libc.so.6
#3  0x004054e6 in main (argc=9, argv=0x7ffc59074158) at pg_subscriber.c:1500
(gdb) f 3
#3  0x004054e6 in main (argc=9, argv=0x7ffc59074158) at pg_subscriber.c:1500
1500            exit(1);
(gdb) list
1495        if (optind < argc)
1496        {
1497            pg_log_error("too many command-line arguments (first is \"%s\")",
1498                         argv[optind]);
1499            pg_log_error_hint("Try \"%s --help\" for more information.", progname);
1500            exit(1);
1501        }
1502
1503        /*
1504         * Required arguments
```
I still think it should be done just before the creation of objects [3].

07. Missing a removal of publications on the standby.

08. Missing registration of LogicalRepInfo in the typedefs.list.

09.
```
+
+
+ pg_subscriber
+ option
+
+
```
Can you reply to my comment #2 [4]? I think the mandatory options should be written out.

10. Just to confirm - will you implement the start_standby/stop_standby functions in the next version?

[1]: https://www.postgresql.org/message-id/TY3PR01MB9889C362FF76102C88FA1C29F56F2%40TY3PR01MB9889.jpnprd01.prod.outlook.com
[2]: https://www.postgresql.org/message-id/CAEG8a3%2BwL_2R8n12BmRz7yBP3EBNdHDhmdgxQFA9vS%2BzPR%2B3Kw%40mail.gmail.com
[3]: https://www.postgresql.org/message-id/TY3PR01MB9889678E47B918F4D83A6FD8F57B2%40TY3PR01MB9889.jpnprd01.prod.outlook.com
[4]: https://www.postgresql.org/message-id/TY3PR01MB9889C362FF76102C88FA1C29F56F2%40TY3PR01MB9889.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/
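[To illustrate comment 06 above: the crash happens because the atexit callback runs on the early exit(1) before anything it dereferences has been initialized. A minimal, self-contained sketch of the two safe patterns being suggested — guard the callback against uninitialized state, and register it only just before objects are created. The names here are illustrative, not the actual pg_subscriber code.]

```c
#include <stdio.h>
#include <stdlib.h>

/* NULL until the objects the callback must clean up actually exist */
static const char *created_objects = NULL;

static void
cleanup_objects_atexit(void)
{
	if (created_objects == NULL)	/* early exit: nothing to clean up */
		return;
	fprintf(stderr, "dropping %s\n", created_objects);
}

int
main(int argc, char **argv)
{
	if (argc > 1)
	{
		fprintf(stderr, "too many command-line arguments\n");
		exit(1);				/* safe: the callback is not registered yet */
	}

	/* register just before the creation of objects */
	atexit(cleanup_objects_atexit);
	created_objects = "replication slots and publications";
	return 0;
}
```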
Re: A performance issue with Memoize
On Fri, 26 Jan 2024 at 19:03, Richard Guo wrote:
> At first I wondered if we should assume that the same param expr must
> have the same equality operator. If not, we should also check the
> operator to tell if the cache key is a duplicate, like
>
> - if (!list_member(*param_exprs, expr))
> + if (!list_member(*param_exprs, expr) ||
> +     !list_member_oid(*operators, hasheqoperator))

Hmm, if that were the case you wouldn't do it that way. You'd need to
forboth() and look for an item in both lists matching the search.

> But after looking at how rinfo->left_hasheqoperator/right_hasheqoperator
> is set, it seems we can assume that: the operator is from the type cache
> entry which is fetched according to the expr datatype.

Yip.

> So I think the patch makes sense.

+1. Thanks for reviewing. I've pushed the patch.

David
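[For reference, the forboth()-based duplicate check David describes would look roughly like the sketch below. This is a hypothetical helper, not code from the patch; it assumes the usual PostgreSQL list API from pg_list.h and treats a cache key as a duplicate only when both the expression and its operator match at the same list position.]

```c
#include "postgres.h"
#include "nodes/pg_list.h"

/* hypothetical helper: is (expr, hasheqoperator) already a cache key? */
static bool
cache_key_is_duplicate(List *param_exprs, List *operators,
					   Node *expr, Oid hasheqoperator)
{
	ListCell   *lc1;
	ListCell   *lc2;

	/* walk both lists in lockstep; positions pair exprs with operators */
	forboth(lc1, param_exprs, lc2, operators)
	{
		if (equal(lfirst(lc1), expr) &&
			lfirst_oid(lc2) == hasheqoperator)
			return true;
	}
	return false;
}
```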
Apply the "LIMIT 1" optimization to partial DISTINCT
In 5543677ec9 we introduced an optimization that uses Limit instead of
Unique to implement DISTINCT when all the DISTINCT pathkeys have been
marked as redundant.  I happened to notice that this optimization was not
applied to partial DISTINCT, and I think it should be.  This can improve
plans in some cases, such as:

-- on master
explain (costs off)
select distinct four from tenk1 where four = 4;
                  QUERY PLAN
----------------------------------------------
 Limit
   ->  Gather
         Workers Planned: 4
         ->  Unique
               ->  Parallel Seq Scan on tenk1
                     Filter: (four = 4)
(6 rows)

-- patched
explain (costs off)
select distinct four from tenk1 where four = 4;
                  QUERY PLAN
----------------------------------------------
 Limit
   ->  Gather
         Workers Planned: 4
         ->  Limit
               ->  Parallel Seq Scan on tenk1
                     Filter: (four = 4)
(6 rows)

Such queries might not be that common, but it's very cheap to apply this
optimization.  Attached is a patch for that.

Thanks
Richard

v1-0001-Apply-the-LIMIT-1-optimization-to-partial-DISTINCT.patch
Description: Binary data
Re: Remove unused fields in ReorderBufferTupleBuf
On Thu, Jan 25, 2024 at 10:17 PM Aleksander Alekseev wrote:
>
> Hi,
>
> > > Here is the corrected patch.
> >
> > Thank you for updating the patch! I have some comments:
>
> Thanks for the review.
>
> > -tuple = (ReorderBufferTupleBuf *)
> > +tuple = (HeapTuple)
> >      MemoryContextAlloc(rb->tup_context,
> > -                       sizeof(ReorderBufferTupleBuf) +
> > +                       HEAPTUPLESIZE +
> >                         MAXIMUM_ALIGNOF +
> >                         alloc_len);
> > -tuple->alloc_tuple_size = alloc_len;
> > -tuple->tuple.t_data = ReorderBufferTupleBufData(tuple);
> > +tuple->t_data = (HeapTupleHeader)((char *)tuple + HEAPTUPLESIZE);
> >
> > Why do we need to add MAXIMUM_ALIGNOF? since HEAPTUPLESIZE is the
> > MAXALIGN'd size, it seems we don't need it. heap_form_tuple() does a
> > similar thing but it doesn't add it.
>
> Indeed. I gave it a try and nothing crashed, so it appears that
> MAXIMUM_ALIGNOF is not needed.
>
> > ---
> > xl_parameter_change *xlrec =
> > -    (xl_parameter_change *) XLogRecGetData(buf->record);
> > +    (xl_parameter_change *) XLogRecGetData(buf->record);
> >
> > Unnecessary change.
>
> That's pgindent. Fixed.
>
> > ---
> > -/*
> > - * Free a ReorderBufferTupleBuf.
> > - */
> > -void
> > -ReorderBufferReturnTupleBuf(ReorderBuffer *rb, ReorderBufferTupleBuf *tuple)
> > -{
> > -    pfree(tuple);
> > -}
> > -
> >
> > Why does ReorderBufferReturnTupleBuf need to be moved from
> > reorderbuffer.c to reorderbuffer.h? It seems not related to this
> > refactoring patch so I think we should do it in a separate patch if we
> > really want it. I'm not sure it's necessary, though.
>
> OK, fixed.

Thank you for updating the patch. It looks good to me. I'm going to push
it next week, barring any objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
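[The single-allocation layout adopted in the diff above mirrors what heap_form_tuple() does: header struct and tuple data in one chunk, with t_data pointing HEAPTUPLESIZE bytes past the start. A condensed sketch (a hypothetical helper, assuming the usual htup_details.h definitions; since HEAPTUPLESIZE is already MAXALIGN'd, no extra MAXIMUM_ALIGNOF padding is needed):]

```c
#include "postgres.h"
#include "access/htup_details.h"

/* hypothetical helper: header and tuple data in one allocation */
static HeapTuple
alloc_tuple_in_context(MemoryContext cxt, uint32 data_len)
{
	HeapTuple	tuple;

	tuple = (HeapTuple) MemoryContextAlloc(cxt, HEAPTUPLESIZE + data_len);
	tuple->t_len = data_len;
	/* point t_data just past the MAXALIGN'd header */
	tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
	return tuple;
}
```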
Re: Small fix on COPY ON_ERROR document
On Thursday, January 25, 2024, Yugo NAGATA wrote:
>
> Maybe, we can separate the sentence into two, for example:
>
> COPY stops operation at the first error. (The exception is if the error
> is due to data type incompatibility and a value other than stop is specified
> to the ON_ERROR option.)

I'd lean more toward saying:

Copy from can be instructed to ignore errors that arise from casting input
data to the data types of the target columns by setting the on_error option
to ignore, skipping the entire input row in the process.

The last part is because another way to ignore would be to set null values
for those columns.

That a command stops on an error is the assumed behavior throughout the
system, so writing "copy stops operation at the first error." just seems
really unnecessary.

I will need to make this tweak and probably a couple of others to my own
suggestions in 12 hours or so.

David J.
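[A concrete, made-up example of the behavior being documented in this thread — with ON_ERROR set to ignore, a row whose value cannot be cast to the target column type is skipped in its entirety rather than aborting the COPY. Syntax follows the patch under discussion:]

```sql
CREATE TABLE t (a int);

COPY t FROM STDIN (FORMAT csv, ON_ERROR 'ignore');
1
not_a_number
2
\.

-- t now contains 1 and 2; the malformed middle row was discarded.
```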
Delay Memoize hashtable build until executor run
Currently, nodeMemoize.c builds the hashtable for the cache during executor startup. This is not what is done in hash joins. I think we should make the two behave the same way.

Per [1] and the corresponding discussion leading to that, making a possibly large allocation at executor startup can lead to excessively long EXPLAIN (not EXPLAIN ANALYZE) times. This can confuse users as we don't mention in EXPLAIN where the time is being spent. Although there's not yet any conclusion that Memoize is to blame, there's another report from someone confused about where this time is being spent in [2].

Working on the Memoize code, I originally created the hash table during executor startup to save on having to check we have a table each time the node is executed. However, the branch for this should be quite predictable and I doubt it'll add any overhead that we would notice.

The patch to do this is attached.

David

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1e731ed12aa
[2] https://postgr.es/m/61e642df-5f48-4e4e-b4c3-58936f90d...@thefreecat.org

diff --git a/src/backend/executor/nodeMemoize.c b/src/backend/executor/nodeMemoize.c
index 0722e4..1075178340 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -278,11 +278,14 @@ MemoizeHash_equal(struct memoize_hash *tb, const MemoizeKey *key1,
 }
 
 /*
- * Initialize the hash table to empty.
+ * Initialize the hash table to empty.  The MemoizeState's hashtable field
+ * must point to NULL.
  */
 static void
 build_hash_table(MemoizeState *mstate, uint32 size)
 {
+	Assert(mstate->hashtable == NULL);
+
 	/* Make a guess at a good size when we're not given a valid size. */
 	if (size == 0)
 		size = 1024;
@@ -400,9 +403,12 @@ remove_cache_entry(MemoizeState *mstate, MemoizeEntry *entry)
 static void
 cache_purge_all(MemoizeState *mstate)
 {
-	uint64		evictions = mstate->hashtable->members;
+	uint64		evictions = 0;
 	PlanState  *pstate = (PlanState *) mstate;
 
+	if (mstate->hashtable != NULL)
+		evictions = mstate->hashtable->members;
+
 	/*
 	 * Likely the most efficient way to remove all items is to just reset the
 	 * memory context for the cache and then rebuild a fresh hash table. This
@@ -410,8 +416,8 @@ cache_purge_all(MemoizeState *mstate)
 	 */
 	MemoryContextReset(mstate->tableContext);
 
-	/* Make the hash table the same size as the original size */
-	build_hash_table(mstate, ((Memoize *) pstate->plan)->est_entries);
+	/* NULLify so we recreate the table on the next call */
+	mstate->hashtable = NULL;
 
 	/* reset the LRU list */
 	dlist_init(&mstate->lru_list);
@@ -707,6 +713,10 @@ ExecMemoize(PlanState *pstate)
 
 				Assert(node->entry == NULL);
 
+				/* first call? we'll need a hash table. */
+				if (unlikely(node->hashtable == NULL))
+					build_hash_table(node, ((Memoize *) pstate->plan)->est_entries);
+
 				/*
 				 * We're only ever in this state for the first call of the
 				 * scan. Here we have a look to see if we've already seen the
@@ -1051,8 +1061,11 @@ ExecInitMemoize(Memoize *node, EState *estate, int eflags)
 	/* Zero the statistics counters */
 	memset(&mstate->stats, 0, sizeof(MemoizeInstrumentation));
 
-	/* Allocate and set up the actual cache */
-	build_hash_table(mstate, node->est_entries);
+	/*
+	 * Because it may require a large allocation we delay building of the
+	 * hash table until executor run.
+	 */
+	mstate->hashtable = NULL;
 
 	return mstate;
 }
@@ -1062,6 +1075,7 @@ ExecEndMemoize(MemoizeState *node)
 {
 #ifdef USE_ASSERT_CHECKING
 	/* Validate the memory accounting code is correct in assert builds. */
+	if (node->hashtable != NULL)
 	{
 		int			count;
 		uint64		mem = 0;
Re: Small fix on COPY ON_ERROR document
On Fri, 26 Jan 2024 15:26:55 +0900
Masahiko Sawada wrote:

> On Fri, Jan 26, 2024 at 2:40 PM Yugo NAGATA wrote:
> >
> > On Fri, 26 Jan 2024 13:59:09 +0900
> > Masahiko Sawada wrote:
> >
> > > On Fri, Jan 26, 2024 at 11:28 AM Yugo NAGATA wrote:
> > > >
> > > > Hi,
> > > >
> > > > I found that the documentation of COPY ON_ERROR said
> > > > COPY stops operation at the first error when
> > > > "ON_ERROR is not specified.", but it also stops when
> > > > ON_ERROR is specified to the default value, "stop".
> > > >
> > > > I attached a very small patch to fix this just for
> > > > making the description more accurate.
> > >
> > > Thank you for the patch!
> > >
> > > +1 to fix it.
> > >
> > > -ON_ERROR is not specified. This
> > > -should not lead to problems in the event of a COPY
> > > +ON_ERROR is not specified or stop.
> > > +This should not lead to problems in the event of a COPY
> > >
> > > How about the following for consistency with the description of the
> > > ON_ERROR option?
> > >
> > > COPY stops operation at the first error if the stop value is specified
> > > to the ON_ERROR option. This should not lead to ...
> >
> > Thank you for your review. However, after posting the previous patch,
> > I noticed that I overlooked that ON_ERROR works only for errors due to
> > data type incompatibility (that is mentioned as "malformed data" in the
> > ON_ERROR description, though).
>
> Right.
>
> > So, how about rewriting this like:
> >
> > COPY stops operation at the first error unless the error is due to data
> > type incompatibility and a value other than stop is specified to the
> > ON_ERROR option.
>
> Hmm, this sentence seems not very readable to me, especially the "COPY
> stops ... unless ... a value other than stop is specified ..." part. I
> think we can simplify it:

I can agree with your opinion on the readability, but...

> COPY stops operation at the first data type incompatibility error if
> the stop value is specified to the ON_ERROR option.

This statement doesn't explain that COPY also stops when an error other
than data type incompatibility (e.g. constraint violations) occurs.

Maybe, we can separate the sentence into two, for example:

COPY stops operation at the first error. (The exception is if the error
is due to data type incompatibility and a value other than stop is
specified to the ON_ERROR option.)

Regards,
Yugo Nagata

> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com

--
Yugo NAGATA
Re: Transaction timeout
> On 26 Jan 2024, at 11:44, Andrey M. Borodin wrote:
>
> 1. It's unsafe for the isolation tester to await transaction_timeout
> within a query. Usually it gets
> FATAL: terminating connection due to transaction timeout
> But if the VM is a bit slow it can get an occasional
> PQconsumeInput failed: server closed the connection unexpectedly
> So, currently all tests use "passive waiting", in a session that will
> not time out.

Oops, sorry, I accidentally sent a version without this fix. Here it is.

Best regards, Andrey Borodin.

v24-0001-Introduce-transaction_timeout.patch
Description: Binary data
Re: Transaction timeout
> On 22 Jan 2024, at 11:23, Peter Smith wrote:
>
> Hi, This patch has a CF status of "Needs Review" [1], but it seems
> there was a CFbot test failure last time it was run [2]. Please have a
> look and post an updated version if necessary.

Thanks Peter! I've inspected the CI failures and they were caused by two different problems:

1. It's unsafe for the isolation tester to await transaction_timeout within a query. Usually it gets
FATAL: terminating connection due to transaction timeout
But if the VM is a bit slow it can get an occasional
PQconsumeInput failed: server closed the connection unexpectedly
So, currently all tests use "passive waiting", in a session that will not time out.

2. In some cases pg_sleep(0.1) was sleeping for up to 200 ms. That was making s7 and s8 fail, because they rely on this margin. I've separated these tests into a different test, timeouts-long, and increased the margin to 300 ms. Now the tests take a horrible 2431 ms. Moreover, I'm afraid that on the buildfarm we can have much randomly-slower machines, so this test might need to be excluded. This test checks COMMIT AND CHAIN and a flow of small queries (Nik's case).

Also, I've verified that every "enable_timeout_after(TRANSACTION_TIMEOUT)" and "disable_timeout(TRANSACTION_TIMEOUT)" is necessary, and found that the case of aborting an "idle in transaction (aborted)" session is not covered by tests. I'm not sure we need a test for this. Japin, Junwang, what do you think?

Thanks!

Best regards, Andrey Borodin.

v23-0001-Introduce-transaction_timeout.patch
Description: Binary data
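[For readers following along, the behavior the tests above exercise can be sketched in a few lines of SQL. The timing values here are made up; the GUC name and FATAL message are from the patch, assuming it behaves as intended:]

```sql
SET transaction_timeout = '100ms';

BEGIN;
SELECT pg_sleep(0.2);
-- FATAL:  terminating connection due to transaction timeout
```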
Re: Popcount optimization using AVX512
On 2024-Jan-25, Alvaro Herrera wrote:

> Finally, the matter of using ifunc as proposed by Noah seems to be still
> in the air, with no patches offered for the popcount family.

Oh, I just realized that the patch as currently proposed is placing the optimized popcount code in the path that does not require going through a function pointer. So the performance increase is probably coming from both avoiding jumping through the pointer as well as from the improved instruction.

This suggests that finding a way to make the ifunc stuff work (with good performance) is critical to this work.

--
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"The ability of users to misuse tools is, of course, legendary" (David Steele)
https://postgr.es/m/11b38a96-6ded-4668-b772-40f992132...@pgmasters.net
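[For context, a GNU ifunc binds the implementation choice at dynamic-link time, so callers reach the selected function directly instead of jumping through a global function pointer on every call. A minimal sketch (GCC/ELF-specific; all function names here are illustrative and not from the actual patch, and the CPU feature string may vary by compiler version):]

```c
#include <stdint.h>

static uint64_t pop64_portable(uint64_t word);
static uint64_t pop64_fast(uint64_t word);

/* resolver: runs once, at dynamic-link time, to pick an implementation */
static uint64_t (*resolve_pop64(void)) (uint64_t)
{
	__builtin_cpu_init();
	return __builtin_cpu_supports("popcnt") ? pop64_fast : pop64_portable;
}

uint64_t pop64(uint64_t word) __attribute__((ifunc("resolve_pop64")));

static uint64_t
pop64_portable(uint64_t word)
{
	uint64_t	n = 0;

	while (word)
	{
		n += word & 1;
		word >>= 1;
	}
	return n;
}

static uint64_t
pop64_fast(uint64_t word)
{
	/* stand-in for the POPCNT/AVX512 code path */
	return (uint64_t) __builtin_popcountll(word);
}
```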
Re: Small fix on COPY ON_ERROR document
On Thu, Jan 25, 2024 at 10:40 PM Yugo NAGATA wrote:
> On Fri, 26 Jan 2024 13:59:09 +0900
> Masahiko Sawada wrote:
>
> > On Fri, Jan 26, 2024 at 11:28 AM Yugo NAGATA wrote:
> > >
> > > Hi,
> > >
> > > I found that the documentation of COPY ON_ERROR said
> > > COPY stops operation at the first error when
> > > "ON_ERROR is not specified.", but it also stops when
> > > ON_ERROR is specified to the default value, "stop".

I'm finding the current wording surrounding ON_ERROR to not fit in with the rest of the page, so I've tried to improve things by being a bit more invasive.

This first hunk updates the description of the COPY command to describe the purpose of the ON_ERROR clause. I too am concerned about the word "parsed" here, and "malformed" in the more detailed descriptions; this would need to be modified to better reflect the circumstances under which ignore happens.

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..5fe4c9f747 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -90,6 +90,14 @@ COPY { table_name [ (
    pg_stat_progress_copy view. See
    for details.
+
+
+   By default, COPY will abort if it encounters errors.
+   For non-binary format COPY FROM the user can specify
+   to instead ignore input rows that cannot be parsed by specifying
+   the ignore option to the ON_ERROR
+   clause.
+

The use of the word "Currently" stood out to me when reading the next hunk. We always document current reality and don't call out that fact; we don't imply that some future feature may change how this all works.

On a related note, right now we have stop and ignore, which are basically enums just like we have for format (csv, text, binary). Unlike format, though, this option requires single quotes. Why is that?

I did keep with not specifying the enum values in the main syntax block since format doesn't either, though writing { stop | ignore } seemed quite appealing.

The rest of the page indicates what the default is in a sentence by itself, not as a parenthetical next to the option. The command limitations seemed worthy of a separate paragraph, though it repeats the description, which isn't great. I'm going to sleep on this one.

@@ -379,17 +387,16 @@ COPY { table_name [ (
      Specifies which
-     error_action to perform when there is malformed data in the input.
-     Currently, only stop (default) and ignore
-     values are supported.
-     If the stop value is specified,
-     COPY stops operation at the first error.
-     If the ignore value is specified,
-     COPY skips malformed data and continues copying data.
-     The option is allowed only in COPY FROM.
-     Only stop value is allowed when
-     using binary format.
+     error_action to perform when encountering a malformed input row:
+     stop or ignore.
+     A value of stop means fail the command, while
+     ignore means discard the input row and continue with the next one.
+     The default is stop
+
+     The ignore option is applicable only for COPY FROM
+     when the FORMAT is text
+     or csv.
+

The following was, IMO, too much text about something not to worry about: COPY TO and ignore mode. The point is dead tuples on error and running vacuum. It doesn't really even matter what caused the error, though if the error occurred before any rows started to be imported then obviously there would be no dead tuples.

Oh, and the word "first" really isn't adding anything here that we cannot reasonably leave to common sense, IMO. We either ignore all errors or stop on the first one. There isn't a stop mode that is capable of bypassing errors, and the ignore mode doesn't have a "first n" option, so one assumes all errors are ignored.

@@ -576,15 +583,12 @@ COPY count
-    COPY stops operation at the first error when
-    ON_ERROR is not specified. This
-    should not lead to problems in the event of a COPY
-    TO, but the target table will already have received
-    earlier rows in a COPY FROM. These rows will not
-    be visible or accessible, but they still occupy disk space. This might
-    amount to a considerable amount of wasted disk space if the failure
-    happened well into a large copy operation. You might wish to invoke
-    VACUUM to recover the wasted space.
+    A failed COPY FROM command will have performed
+    physical insertion of all rows prior to the malformed one.
+    While these rows will not be visible or accessible, they still occupy
+    disk space. This might amount to a considerable amount of wasted disk
+    space if the failure happened well into a large copy operation.
+    VACUUM should be used to recover the wasted space.

David J.
Re: make dist using git archive
On 25.01.24 17:25, Tristan Partin wrote:
> The way that this currently works is that you will fail at configure
> time if bz2 doesn't exist on the system. Meson will try to resolve a
> .path() method on a NotFoundProgram. You might want to define the bz2
> target to just call `exit 1` in this case.
>
> if bzip2.found()
>   # do your current target
> else
>   bz2 = run_target('tar.bz2', command: ['exit', 1])
> endif
>
> This should cause pgdist to appropriately fail at run time when
> generating the bz2 tarball.

Ok, done that way.

> For what it's worth, I run Meson 1.3, and the behavior of generating
> the tarballs even though it is a dirty tree still occurred. In the new
> patch you seem to say it was fixed in 0.60.

The problem I'm referring to is that before 0.60, alias_target cannot depend on run_target (only "build target"). This is AFAICT not documented and might not have been an intentional change, but you can trace it in the meson source code, and it shows in the PostgreSQL CI. That's also why for the above bzip2 issue I have to use custom_target in place of your run_target.

From 48ddd079c1530baaabf92a5650ff9a7bfa7ac3d6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Thu, 25 Jan 2024 12:28:28 +0100
Subject: [PATCH v2 1/2] make dist uses git archive

Discussion: https://www.postgresql.org/message-id/flat/40e80f77-a294-4f29-a16f-e21bc7bc75fc%40eisentraut.org
---
 GNUmakefile.in | 34 ++++++++++++----------------------
 meson.build    | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+), 22 deletions(-)

diff --git a/GNUmakefile.in b/GNUmakefile.in
index eba569e930e..680c765dd73 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -87,29 +87,19 @@ update-unicode: | submake-generated-headers submake-libpgport
 
 distdir	= postgresql-$(VERSION)
 dummy	= =install=
 
+GIT = git
+
 dist: $(distdir).tar.gz $(distdir).tar.bz2
-	rm -rf $(distdir)
-
-$(distdir).tar: distdir
-	$(TAR) chf $@ $(distdir)
-
-.INTERMEDIATE: $(distdir).tar
-
-distdir-location:
-	@echo $(distdir)
-
-distdir:
-	rm -rf $(distdir)* $(dummy)
-	for x in `cd $(top_srcdir) && find . \( -name CVS -prune \) -o \( -name .git -prune \) -o -print`; do \
-	  file=`expr X$$x : 'X\./\(.*\)'`; \
-	  if test -d "$(top_srcdir)/$$file" ; then \
-	    mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \
-	  else \
-	    ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \
-	      || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \
-	  fi || exit; \
-	done
-	$(MAKE) -C $(distdir) distclean
+
+.PHONY: check-dirty-index
+check-dirty-index:
+	$(GIT) -C $(srcdir) diff-index --quiet HEAD
+
+$(distdir).tar.gz: check-dirty-index
+	$(GIT) -C $(srcdir) archive --format tar.gz --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
+
+$(distdir).tar.bz2: check-dirty-index
+	$(GIT) -C $(srcdir) -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
 
 distcheck: dist
 	rm -rf $(dummy)

diff --git a/meson.build b/meson.build
index 55184db2488..c245a1818e0 100644
--- a/meson.build
+++ b/meson.build
@@ -3348,6 +3348,53 @@ run_target('help',
 
 
+###############################################################
+# Distribution archive
+###############################################################
+
+git = find_program('git', required: false, native: true)
+bzip2 = find_program('bzip2', required: false, native: true)
+
+distdir = meson.project_name() + '-' + meson.project_version()
+
+check_dirty_index = run_target('check-dirty-index',
+  command: [git, '-C', '@SOURCE_ROOT@', 'diff-index', '--quiet', 'HEAD'])
+
+tar_gz = custom_target('tar.gz',
+  build_always_stale: true,
+  command: [git, '-C', '@SOURCE_ROOT@', 'archive',
+            '--format', 'tar.gz',
+            '--prefix', distdir + '/',
+            '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+            'HEAD', '.'],
+  install: false,
+  output: distdir + '.tar.gz',
+)
+
+if bzip2.found()
+  tar_bz2 = custom_target('tar.bz2',
+    build_always_stale: true,
+    command: [git, '-C', '@SOURCE_ROOT@', '-c', 'tar.tar.bz2.command="' + bzip2.path() + '" -c', 'archive',
+              '--format', 'tar.bz2',
+              '--prefix', distdir + '/',
+              '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+              'HEAD', '.'],
+    install: false,
+    output: distdir + '.tar.bz2',
+  )
+else
+  tar_bz2 = custom_target('tar.bz2',
+    command: ['sh', '-c', 'exit 1'],
+    output: distdir + '.tar.bz2',
+  )
+endif
+
+# FIXME: would like to add check_dirty_index here, broken(?) before
+# meson 0.60.0
+alias_target('pgdist', [tar_gz, tar_bz2])
+
+
 
 ###############################################################
 # The End, The End, My Friend
 ###############################################################

base-commit: 46d8587b504170ca14f064890bc7ccbbd7523f81
--
2.43.0

From
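[With the patch above applied, the tarballs would be produced through the new targets; a brief usage sketch (the target names come from the patch, the build directory name is an assumption):]

```sh
# meson build: build both tarballs via the new alias target
meson compile -C build pgdist

# autoconf build: same result via the rewritten rule
make dist
```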
Re: Small fix on COPY ON_ERROR document
On Fri, Jan 26, 2024 at 2:40 PM Yugo NAGATA wrote:
>
> On Fri, 26 Jan 2024 13:59:09 +0900
> Masahiko Sawada wrote:
>
> > On Fri, Jan 26, 2024 at 11:28 AM Yugo NAGATA wrote:
> > >
> > > Hi,
> > >
> > > I found that the documentation of COPY ON_ERROR said
> > > COPY stops operation at the first error when
> > > "ON_ERROR is not specified.", but it also stops when
> > > ON_ERROR is specified to the default value, "stop".
> > >
> > > I attached a very small patch to fix this just for
> > > making the description more accurate.
> >
> > Thank you for the patch!
> >
> > +1 to fix it.
> >
> > -ON_ERROR is not specified. This
> > -should not lead to problems in the event of a COPY
> > +ON_ERROR is not specified or stop.
> > +This should not lead to problems in the event of a COPY
> >
> > How about the following for consistency with the description of the
> > ON_ERROR option?
> >
> > COPY stops operation at the first error if the stop value is specified
> > to the ON_ERROR option. This should not lead to ...
>
> Thank you for your review. However, after posting the previous patch,
> I noticed that I overlooked that ON_ERROR works only for errors due to
> data type incompatibility (that is mentioned as "malformed data" in the
> ON_ERROR description, though).

Right.

> So, how about rewriting this like:
>
> COPY stops operation at the first error unless the error is due to data
> type incompatibility and a value other than stop is specified to the
> ON_ERROR option.

Hmm, this sentence seems not very readable to me, especially the "COPY
stops ... unless ... a value other than stop is specified ..." part. I
think we can simplify it:

COPY stops operation at the first data type incompatibility error if
the stop value is specified to the ON_ERROR option.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Re: A performance issue with Memoize
On Fri, Jan 26, 2024 at 12:18 PM David Rowley wrote:
> On Fri, 26 Jan 2024 at 16:51, Tom Lane wrote:
> > >> However ... it seems like we're not out of the woods yet. Why
> > >> is Richard's proposed test case still showing
> > >> + -> Memoize (actual rows=5000 loops=N)
> > >> +   Cache Key: t1.two, t1.two
> > >> Seems like there is missing de-duplication logic, or something.
> >
> > > This seems separate and isn't quite causing the same problems as what
> > > Richard wants to fix so I didn't touch this for now.
> >
> > Fair enough, but I think it might be worth pursuing later.
>
> Here's a patch for that.

At first I wondered if we should assume that the same param expr must have the same equality operator. If not, we should also check the operator to tell if the cache key is a duplicate, like

- if (!list_member(*param_exprs, expr))
+ if (!list_member(*param_exprs, expr) ||
+     !list_member_oid(*operators, hasheqoperator))

But after looking at how rinfo->left_hasheqoperator/right_hasheqoperator is set, it seems we can assume that: the operator is from the type cache entry which is fetched according to the expr datatype. So I think the patch makes sense. +1.

Thanks
Richard
Race condition in FetchTableStates() breaks synchronization of subscription tables
Hello hackers,

After determining a possible cause for intermittent failures of the test subscription/031_column_list [1], I was wondering what makes another subscription test (014_binary) fail on the buildfarm:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snakefly&dt=2024-01-22%2001%3A19%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2024-01-14%2018%3A19%3A20
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet&dt=2023-12-21%2001%3A11%3A52
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2023-11-27%2001%3A42%3A39

All those failures are caused by a timeout when waiting for a message expected in _subscriber.log. For example, in the snakefly case:

[01:14:46.158](1.937s) ok 7 - check synced data on subscriber with custom type
timed out waiting for match: (?^:ERROR: ( [A-Z0-9]+:)? incorrect binary data format) at /home/bf/bf-build/piculet/HEAD/pgsql/src/test/subscription/t/014_binary.pl line 269.

_subscriber.log contains:

2023-12-21 01:14:46.215 UTC [410039] 014_binary.pl LOG: statement: ALTER SUBSCRIPTION tsub REFRESH PUBLICATION;
2023-12-21 01:17:46.756 UTC [40] ERROR: could not receive data from WAL stream: server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.
2023-12-21 01:17:46.760 UTC [405057] LOG: background worker "logical replication apply worker" (PID 40) exited with exit code 1
2023-12-21 01:17:46.779 UTC [532857] LOG: logical replication apply worker for subscription "tsub" has started
...

While _subscriber.log from a successful test run contains:

2024-01-26 03:49:07.065 UTC [9726:5] 014_binary.pl LOG: statement: ALTER SUBSCRIPTION tsub REFRESH PUBLICATION;
2024-01-26 03:49:07.075 UTC [9726:6] 014_binary.pl LOG: disconnection: session time: 0:00:00.014 user=postgres database=postgres host=[local]
2024-01-26 03:49:07.558 UTC [9729:1] LOG: logical replication apply worker for subscription "tsub" has started
2024-01-26 03:49:07.563 UTC [9731:1] LOG: logical replication table synchronization worker for subscription "tsub", table "test_mismatching_types" has started
2024-01-26 03:49:07.585 UTC [9731:2] ERROR: incorrect binary data format
2024-01-26 03:49:07.585 UTC [9731:3] CONTEXT: COPY test_mismatching_types, line 1, column a

In this case, "logical replication apply worker for subscription "tsub" has started" appears just after "ALTER SUBSCRIPTION", not 3 minutes later.

I've managed to reproduce this failure locally by running multiple tests in parallel, and my analysis shows that it is caused by a race condition when accessing the variable table_states_valid inside tablesync.c.

tablesync.c does the following with table_states_valid:

/*
 * Callback from syscache invalidation.
 */
void
invalidate_syncing_table_states(Datum arg, int cacheid, uint32 hashvalue)
{
	table_states_valid = false;
}
...
static bool
FetchTableStates(bool *started_tx)
{
...
	if (!table_states_valid)
	{
...
		/* Fetch all non-ready tables. */
		rstates = GetSubscriptionRelations(MySubscription->oid, true);
...
		table_states_valid = true;
	}

So, when a syscache invalidation occurs inside the code block "if (!table_states_valid)", that invalidation is effectively ignored.

In a failed case I observe the following events:
1. The logical replication apply worker performs LogicalRepApplyLoop() -> process_syncing_tables() -> process_syncing_tables_for_apply() -> FetchTableStates() periodically.
2. ALTER SUBSCRIPTION tsub REFRESH PUBLICATION invalidates the syscache for SUBSCRIPTIONRELMAP, and that leads to calling invalidate_syncing_table_states().
3. If the apply worker manages to fetch no non-ready tables in FetchTableStates() and ignore "table_states_valid = false" from invalidate_syncing_table_states(), then it just misses the invalidation event, so it keeps working without noticing the non-ready tables that appeared as the result of ALTER SUBSCRIPTION (process_syncing_tables_for_apply() skips the loop "foreach(lc, table_states_not_ready) ..." until some other event occurs).

pg_usleep(10) added just below GetSubscriptionRelations(...) proves my analysis -- without it, I need tens of iterations (with 50 tests running in parallel) to catch the failure, but with it, I get the failure on the first iteration. (Reproduced on REL_16_STABLE..master, where the test 014_binary tries to "Refresh the publication to trigger the tablesync", but I think the race condition exists in the previous versions as well.)

[1] https://www.postgresql.org/message-id/16d6d9cc-f97d-0b34-be65-425183ed3...@gmail.com

Best regards,
Alexander
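[To make the lost-invalidation window easier to see, here is a self-contained toy model of it — plain C, single-threaded; the names merely mimic the real ones and this is not server code. The usual fix pattern for such races is to mark the cache valid *before* fetching, so an invalidation delivered during the fetch survives and forces a re-read on the next call:]

```c
#include <stdbool.h>
#include <stdio.h>

static bool table_states_valid = false;

/* stands in for invalidate_syncing_table_states() */
static void
invalidation_callback(void)
{
	table_states_valid = false;
}

/* stands in for GetSubscriptionRelations(); in the server, a syscache
 * invalidation can be delivered while this executes */
static void
fetch_states(void)
{
	invalidation_callback();
}

static void
refresh_buggy(void)
{
	if (!table_states_valid)
	{
		fetch_states();
		table_states_valid = true;	/* clobbers the invalidation above */
	}
}

static void
refresh_fixed(void)
{
	if (!table_states_valid)
	{
		table_states_valid = true;	/* set first ... */
		fetch_states();				/* ... so a mid-fetch invalidation sticks */
	}
}

int
main(void)
{
	refresh_buggy();
	printf("buggy: valid=%d (invalidation lost)\n", table_states_valid);

	table_states_valid = false;
	refresh_fixed();
	printf("fixed: valid=%d (will re-fetch next time)\n", table_states_valid);
	return 0;
}
```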
RE: speed up a logical replica setup
Dear Euler,

Thanks for updating the patch! Before reading yours, I wanted to reply to some of the comments.

> I'm still thinking about replacing --subscriber-conninfo with separate
> items (username, port, password?, host = socket dir). Maybe it is
> overengineering. The user can always prepare the environment to avoid
> unwanted and/or external connections.

For me, the required amount of fixes is not so different from the current one. How about others?

> It is extremely useful because (a) you have a physical replication setup
> and don't know if it is prepared for logical replication, (b) check GUCs
> (is max_wal_senders sufficient for this pg_subscriber command? Or is
> max_replication_slots sufficient to set up the logical replication even
> though I already have some used replication slots?), (c) connectivity
> and (d) credentials.

Yeah, it is useful for verification purposes, so let's keep this option. But I still think the naming should be "--check". Also, there are many `if (!dry_run)` blocks, but most of them can be removed if the process exits earlier. Thoughts?

> > 05.
> > I felt we should accept some settings from environment variables, like
> > pg_upgrade. Currently, the below items should be accepted.
> >
> > - data directory
> > - username
> > - port
> > - timeout
>
> Maybe PGDATA.

Sorry, I cannot follow this. Did you mean that the target data directory should be able to be specified by PGDATA? If so, +1.

> > 06.
> > ```
> > pg_logging_set_level(PG_LOG_WARNING);
> > ```
> >
> > If the default log level is warning, there are no ways to output debug
> > logs. (The -v option only raises it by one, so INFO would be output.)
> > I think it should be PG_LOG_INFO.
>
> You need to specify multiple -v options.

Hmm. I felt the specification was a bit strange... but at least it must be described in the documentation. pg_dump.sgml has similar lines.

> > 08.
> > Not sure, but if we want to record outputs by pg_subscriber, a
> > sub-directory should be created. The name should contain the timestamp.
>
> The log file already contains the timestamp. Why?

This comment assumed that outputs by pg_subscriber were also recorded to a file. In that case, and if the file also has the same timestamp, I think they can be gathered in the same place. No need if outputs are not recorded.

> > 09.
> > Not sure, but should we check max_slot_wal_keep_size of the primary
> > server? It can avoid a failure when starting logical replication.
>
> A broken physical replication *before* running this tool is its
> responsibility? Hmm. We might add another check that can be noticed
> during dry run mode.

I thought that we should not generate any broken objects, but indeed, I'm not sure it is in our scope. What do others think?

> > 11.
> > ```
> > /*
> >  * Stop the subscriber if it is a standby server. Before executing the
> >  * transformation steps, make sure the subscriber is not running because
> >  * one of the steps is to modify some recovery parameters that require a
> >  * restart.
> >  */
> > if (stat(pidfile, ) == 0)
> > ```
> >
> > I kept this just in case, but I'm not sure it is still needed. What do
> > you think? Removing it can reduce an inclusion of pidfile.h.
>
> Are you suggesting another way to check if the standby is up and running?

Running `pg_ctl stop` itself can detect whether the process is still alive. It would exit with 1 when the process is not there.

> I didn't notice the comment, but I'm still not sure of the reason. Why
> must we reserve the slot until pg_subscriber finishes? IIUC, the slot
> would never be used; it is created only for getting a consistent_lsn,
> so we do not have to keep it. Also, just before this, logical replication
> slots for each database are created, so WAL records are surely reserved.

I want to confirm the conclusion - will you remove the creation of the transient slot?

Also, not tested, but I'm now considering that we can reuse the primary_conninfo value. We are assuming that the target server is a standby and that its current upstream will be converted to the publisher. In this case, the connection string is already specified as primary_conninfo, so --publisher-conninfo may not be needed. The parameter does not contain a database name, so --databases is still needed. I imagine something like:

1. Parse options
2. Turn on the standby
3. Verify the standby
4. Turn off the standby
5. Get primary_conninfo from the standby
6. Connect to the primary (a concatenation of primary_conninfo and an option is used)
7. Verify the primary
...

What do you think?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/
Re: Small fix on COPY ON_ERROR document
On Fri, 26 Jan 2024 13:59:09 +0900
Masahiko Sawada wrote:

> On Fri, Jan 26, 2024 at 11:28 AM Yugo NAGATA wrote:
> >
> > Hi,
> >
> > I found that the documentation of COPY ON_ERROR said
> > COPY stops operation at the first error when
> > "ON_ERROR is not specified.", but it also stops when
> > ON_ERROR is specified to the default value, "stop".
> >
> > I attached a very small patch to fix this just for
> > making the description more accurate.
>
> Thank you for the patch!
>
> +1 to fix it.
>
> -ON_ERROR is not specified. This
> -should not lead to problems in the event of a COPY
> +ON_ERROR is not specified or stop.
> +This should not lead to problems in the event of a COPY
>
> How about the following for consistency with the description of the
> ON_ERROR option?
>
> COPY stops operation at the first error if the stop value is specified
> to the ON_ERROR option. This should not lead to ...

Thank you for your review. However, after posting the previous patch,
I noticed that I overlooked that ON_ERROR works only for errors due to
data type incompatibility (that is mentioned as "malformed data" in the
ON_ERROR description, though).

So, how about rewriting this like:

COPY stops operation at the first error unless the error is due to data
type incompatibility and a value other than stop is specified to the
ON_ERROR option.

I attached the updated patch.

Regards,
Yugo Nagata

> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com

--
Yugo NAGATA

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..3676bf0e87 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -576,9 +576,10 @@ COPY count
-    COPY stops operation at the first error when
-    ON_ERROR is not specified. This
-    should not lead to problems in the event of a COPY
+    COPY stops operation at the first error unless the error
+    is due to data type incompatibility and a value other than
+    stop is specified to the ON_ERROR option.
+    This should not lead to problems in the event of a COPY
     TO, but the target table will already have received
     earlier rows in a COPY FROM. These rows will not
     be visible or accessible, but they still occupy disk space. This might
Re: A performance issue with Memoize
On Fri, Jan 26, 2024 at 1:22 AM Tom Lane wrote:
> Apologies for not having noticed this thread before. I'm taking
> a look at it now. However, while sniffing around this I found
> what seems like an oversight in paramassign.c's
> assign_param_for_var(): it says it should compare all the same
> fields as _equalVar except for varlevelsup, but it's failing to
> compare varnullingrels. Is that a bug? It's conceivable that
> it's not possible to get here with varnullingrels different and
> all else the same, but I don't feel good about that proposition.
>
> I tried adding
>
> @@ -91,7 +91,10 @@ assign_param_for_var(PlannerInfo *root, Var *var)
>     pvar->vartype == var->vartype &&
>     pvar->vartypmod == var->vartypmod &&
>     pvar->varcollid == var->varcollid)
> +   {
> +       Assert(bms_equal(pvar->varnullingrels, var->varnullingrels));
>         return pitem->paramId;
> +   }
>     }
> }

Yeah, I think it should be safe to assert that the varnullingrels is equal here. The Var is supposed to be an upper-level Var, and two same such Vars should not have different varnullingrels at this point, although the varnullingrels might be adjusted later in identify_current_nestloop_params according to which form of identity 3 we end up applying.

Thanks
Richard
Re: Small fix on COPY ON_ERROR document
On Fri, Jan 26, 2024 at 11:28 AM Yugo NAGATA wrote:
>
> Hi,
>
> I found that the documentation of COPY ON_ERROR said
> COPY stops operation at the first error when
> "ON_ERROR is not specified.", but it also stops when
> ON_ERROR is specified to the default value, "stop".
>
> I attached a very small patch to fix this just for
> making the description more accurate.

Thank you for the patch!

+1 to fix it.

-ON_ERROR is not specified. This
-should not lead to problems in the event of a COPY
+ON_ERROR is not specified or stop.
+This should not lead to problems in the event of a COPY

How about the following for consistency with the description of the
ON_ERROR option?

COPY stops operation at the first error if the stop value is specified
to the ON_ERROR option. This should not lead to ...

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Re: Remove unused fields in ReorderBufferTupleBuf
On Fri, Jan 26, 2024 at 1:24 PM wrote:
>
> On Thu, 2024-01-25 at 16:11 -0500, reid.thomp...@crunchydata.com wrote:
> >
> > I walked through v6 and didn't note any issues.

Thank you for reviewing the patch!

> > I do want to ask, the patch alters ReorderBufferReturnTupleBuf() and
> > drops the unused parameter ReorderBuffer *rb. It seems that
> > ReorderBufferFreeSnap(), ReorderBufferReturnRelids(),
> > ReorderBufferImmediateInvalidation(), and ReorderBufferRestoreCleanup()
> > also pass ReorderBuffer *rb but do not use it. Would it be beneficial
> > to implement a separate patch to remove this parameter from these
> > functions also?
>
> It also appears that ReorderBufferSerializeChange() has 5 instances
> where it increments the local variable "data" but the result is never
> read.
> Lines 3806, 3836, 3854, 3889, 3910
>
> I can create a patch and post it to this thread or a new one if deemed
> worthwhile.

I'm not sure these changes are really beneficial. They contribute to improving neither readability nor performance, IMO.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Re: Remove unused fields in ReorderBufferTupleBuf
On Thu, 2024-01-25 at 16:11 -0500, reid.thomp...@crunchydata.com wrote:
>
> I walked through v6 and didn't note any issues.
>
> I do want to ask, the patch alters ReorderBufferReturnTupleBuf() and
> drops the unused parameter ReorderBuffer *rb. It seems that
> ReorderBufferFreeSnap(), ReorderBufferReturnRelids(),
> ReorderBufferImmediateInvalidation(), and ReorderBufferRestoreCleanup()
> also pass ReorderBuffer *rb but do not use it. Would it be beneficial
> to implement a separate patch to remove this parameter from these
> functions also?
>
> Thanks,
> Reid

It also appears that ReorderBufferSerializeChange() has 5 instances where it increments the local variable "data" but the result is never read.
Lines 3806, 3836, 3854, 3889, 3910

I can create a patch and post it to this thread or a new one if deemed worthwhile.

Thanks,
Reid
Re: A performance issue with Memoize
On Fri, 26 Jan 2024 at 16:51, Tom Lane wrote:
> >> However ... it seems like we're not out of the woods yet. Why
> >> is Richard's proposed test case still showing
> >> + -> Memoize (actual rows=5000 loops=N)
> >> +   Cache Key: t1.two, t1.two
> >> Seems like there is missing de-duplication logic, or something.
>
> > This seems separate and isn't quite causing the same problems as what
> > Richard wants to fix so I didn't touch this for now.
>
> Fair enough, but I think it might be worth pursuing later.

Here's a patch for that.

David

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index c0ba087b40..8ecf9fe8dc 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -492,8 +492,16 @@ paraminfo_get_equal_hashops(PlannerInfo *root, ParamPathInfo *param_info,
 			return false;
 		}
 
-		*operators = lappend_oid(*operators, hasheqoperator);
-		*param_exprs = lappend(*param_exprs, expr);
+		/*
+		 * 'expr' may already exist as a parameter from a previous ppi_clauses.
+		 * No need to include it again, however we'd better ensure we do switch
+		 * into binary mode if required. See below.
+		 */
+		if (!list_member(*param_exprs, expr))
+		{
+			*operators = lappend_oid(*operators, hasheqoperator);
+			*param_exprs = lappend(*param_exprs, expr);
+		}
 
 		/*
 		 * When the join operator is not hashable then it's possible that
@@ -536,8 +544,16 @@ paraminfo_get_equal_hashops(PlannerInfo *root, ParamPathInfo *param_info,
 			return false;
 		}
 
-		*operators = lappend_oid(*operators, typentry->eq_opr);
-		*param_exprs = lappend(*param_exprs, expr);
+		/*
+		 * 'expr' may already exist as a parameter from the ppi_clauses. No
+		 * need to include it again, however we'd better ensure we do switch
+		 * into binary mode.
+		 */
+		if (!list_member(*param_exprs, expr))
+		{
+			*operators = lappend_oid(*operators, typentry->eq_opr);
+			*param_exprs = lappend(*param_exprs, expr);
+		}
 
 		/*
 		 * We must go into binary mode as we don't have too much of an idea of
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index ca198ec3b8..17bb3c8661 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -107,7 +107,7 @@ WHERE t1.unique1 < 10;', false);
    ->  Index Scan using tenk1_unique1 on tenk1 t1 (actual rows=10 loops=N)
          Index Cond: (unique1 < 10)
    ->  Memoize (actual rows=2 loops=N)
-         Cache Key: t1.two, t1.two
+         Cache Key: t1.two
          Cache Mode: binary
          Hits: 8  Misses: 2  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Subquery Scan on t2 (actual rows=2 loops=N)
Re: A performance issue with Memoize
David Rowley writes:
> I've adjusted the comments to what you mentioned and also leaned out
> the pretty expensive test case to something that'll run much faster
> and pushed the result.

+1, I was wondering if the test could be cheaper. It wasn't horrid as Richard had it, but core regression tests add up over time.

>> However ... it seems like we're not out of the woods yet. Why
>> is Richard's proposed test case still showing
>> + -> Memoize (actual rows=5000 loops=N)
>> +   Cache Key: t1.two, t1.two
>> Seems like there is missing de-duplication logic, or something.

> This seems separate and isn't quite causing the same problems as what
> Richard wants to fix so I didn't touch this for now.

Fair enough, but I think it might be worth pursuing later.

			regards, tom lane
Re: A performance issue with Memoize
On Fri, 26 Jan 2024 at 07:32, Tom Lane wrote:
>
> David Rowley writes:
> > I'd feel better about doing it your way if Tom could comment on if
> > there was a reason he put the function calls that way around in
> > 5ebaaa494.
>
> I'm fairly sure I thought it wouldn't matter because of the Param
> de-duplication done in paramassign.c. However, Richard's example
> shows that's not so, because process_subquery_nestloop_params is
> picky about the param ID assigned to a particular Var while
> replace_nestloop_params is not. So flipping the order makes sense.

Makes sense. I've adjusted the comments to what you mentioned and also leaned out the pretty expensive test case to something that'll run much faster and pushed the result.

> However ... it seems like we're not out of the woods yet. Why
> is Richard's proposed test case still showing
>
> + -> Memoize (actual rows=5000 loops=N)
> +   Cache Key: t1.two, t1.two
>
> Seems like there is missing de-duplication logic, or something.

This seems separate and isn't quite causing the same problems as what Richard wants to fix so I didn't touch this for now.

David
Re: Use of backup_label not noted in log
On Thu, Jan 25, 2024 at 05:44:52PM -0400, David Steele wrote:
> On 1/25/24 17:42, Tom Lane wrote:
> > We're talking about 1d35f705e, right? That certainly looks harmless
> > and potentially useful. I'm +1 for back-patching.
>
> That's the one. If we were modifying existing messages I would be against
> it, but new, infrequent (but oh so helpful) messages seem fine.

Well, I'm OK with this consensus on 1d35f705e if folks think this is useful enough for all the stable branches.

--
Michael

signature.asc
Description: PGP signature
Re: Improve WALRead() to suck data directly from WAL buffers when possible
On Thu, 2024-01-25 at 14:35 +0530, Bharath Rupireddy wrote:
> > "expecting zeros at the end" - this can't always be true as the WAL
> ...
> I think this needs to be discussed separately. If okay, I'll start a
> new thread.

Thank you for investigating. When the above issue is handled, I'll be more comfortable expanding the call sites for XLogReadFromBuffers().

> > Also, if we've detected that the first requested buffer has been
> > evicted, is there any value in continuing the loop to see if more
> > recent buffers are available? For example, if the requested LSNs range
> > over buffers 4, 5, and 6, and 4 has already been evicted, should we try
> > to return LSN data from 5 and 6 at the proper offset in the dest
> > buffer? If so, we'd need to adjust the API so the caller knows what
> > parts of the dest buffer were filled in.
>
> I'd second this capability for now to keep the API simple and clear,
> but we can consider expanding it as needed.

Agreed. This case doesn't seem important; I just thought I'd ask about it.

> If the callers use GetFlushRecPtr() to determine how far to read,
> LogwrtResult will be *reasonably* latest

It will be up-to-date enough that we'd never go through WaitXLogInsertionsToFinish(), which is all we care about.

> As far as the current WAL readers are concerned, we don't need an
> explicit spinlock to determine LogwrtResult because all of them use
> GetFlushRecPtr() to determine how far to read. If there's any caller
> that's not updating LogwrtResult at all, we can consider reading
> LogwrtResult ourselves in the future.

So we don't actually need that path yet, right?

> 5. I think the two requirements specified at
> https://www.postgresql.org/message-id/20231109205836.zjoawdrn4q77yemv%40awork3.anarazel.de
> still hold with the v19j.

Agreed.

> PSA v20 patch set.

0001 is very close. I have the following suggestions:

* Don't just return zero. If the caller is doing something we don't expect, we want to fix the caller. I understand you'd like this to be more like a transparent optimization, and we may do that later, but I don't think it's a good idea to do that now.

* There's currently no use for reading LSNs between Write and Insert, so remove the WaitXLogInsertionsToFinish() code path. That also means we don't need the extra emitLog parameter, so we can remove that. When we have a use case, we can bring it all back.

If you agree, I can just make those adjustments (and do some final checking) and commit 0001. Otherwise let me know what you think.

0002: How does the test control whether the data requested is before the Flush pointer, the Write pointer, or the Insert pointer? What if the walwriter comes in and moves one of those pointers before the next statement is executed? Also, do you think a test module is required for the basic functionality in 0001, or only when we start doing more complex things like reading past the Flush pointer?

0003: can you explain why this is useful for the wal summarizer to read from the buffers?

Regards,
Jeff Davis
Re: A performance issue with Memoize
On Fri, Jan 26, 2024 at 2:32 AM Tom Lane wrote:
> I'm fairly sure I thought it wouldn't matter because of the Param
> de-duplication done in paramassign.c. However, Richard's example
> shows that's not so, because process_subquery_nestloop_params is
> picky about the param ID assigned to a particular Var while
> replace_nestloop_params is not. So flipping the order makes sense.
> I'd change the comment though, maybe to
>
> /*
>  * Replace any outer-relation variables with nestloop params.
>  *
>  * We must provide nestloop params for both lateral references of
>  * the subquery and outer vars in the scan_clauses. It's better
>  * to assign the former first, because that code path requires
>  * specific param IDs, while replace_nestloop_params can adapt
>  * to the IDs assigned by process_subquery_nestloop_params.
>  * This avoids possibly duplicating nestloop params when the
>  * same Var is needed for both reasons.
>  */

+1. It's much better.

> However ... it seems like we're not out of the woods yet. Why
> is Richard's proposed test case still showing
>
> + -> Memoize (actual rows=5000 loops=N)
> +   Cache Key: t1.two, t1.two
>
> Seems like there is missing de-duplication logic, or something.

When we collect the cache keys in paraminfo_get_equal_hashops() we search param_info's ppi_clauses as well as innerrel's lateral_vars for outer expressions. We do not perform de-duplication on the collected outer expressions there. In my proposed test case, the same Var 't1.two' appears both in param_info->ppi_clauses and in innerrel->lateral_vars, so we see two identical cache keys in the plan.

I noticed this before and wondered if we should do de-duplication on the cache keys, but somehow I did not chase this to the ground.

Thanks
Richard
Re: gai_strerror() is not thread-safe on Windows
On Tue, 5 Dec 2023 at 00:57, Thomas Munro wrote:
>
> On second thoughts, I guess it would make more sense to use the exact
> messages Windows' own implementation would return instead of whatever
> we had in the past (probably cribbed from some other OS or just made
> up?). I asked CI to spit those out [1]. Updated patch attached. Will
> add to CF.

CFBot shows that the patch does not apply anymore, as in [1]:

=== Applying patches on top of PostgreSQL commit ID 376c216138c75e161d39767650ea30536f23b482 ===
=== applying patch ./v2-0001-Fix-gai_strerror-thread-safety-on-Windows.patch
patching file configure
Hunk #1 succeeded at 16388 (offset 34 lines).
patching file configure.ac
Hunk #1 succeeded at 1885 (offset 7 lines).
patching file src/include/port/win32/sys/socket.h
patching file src/port/meson.build
patching file src/port/win32gai_strerror.c
can't find file to patch at input line 134
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
|index 46df01cc8d..c51296bdb6 100644
|--- a/src/tools/msvc/Mkvcbuild.pm
|+++ b/src/tools/msvc/Mkvcbuild.pm
--------------------------
No file to patch.  Skipping patch.
1 out of 1 hunk ignored

Please have a look and post an updated version.

[1] - http://cfbot.cputube.org/patch_46_4682.log

Regards,
Vignesh
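[For readers unfamiliar with the underlying problem: Windows' gai_strerror() formats its message into a shared static buffer, which is what makes it thread-unsafe. The patch's approach — returning immutable string literals per error code — can be sketched as below. The error codes and wording here are illustrative, not the full table from win32gai_strerror.c:]

```c
#include <winsock2.h>
#include <ws2tcpip.h>

/* thread-safe: string literals have static storage and are never modified */
const char *
pg_gai_strerror(int errcode)
{
	switch (errcode)
	{
		case EAI_AGAIN:
			return "Temporary failure in name resolution.";
		case EAI_FAIL:
			return "Non-recoverable failure in name resolution.";
		case EAI_MEMORY:
			return "Memory allocation failure.";
		case EAI_NONAME:
			return "Name or service not known.";
		default:
			return "Unknown server error";
	}
}
```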
Re: Improve tab completion for ALTER DEFAULT PRIVILEGE and ALTER TABLE
On Mon, 27 Nov 2023 at 21:58, vignesh C wrote: > > On Fri, 24 Nov 2023 at 18:37, Shubham Khanna > wrote: > > > > n Fri, Nov 24, 2023 at 6:33 PM vignesh C wrote: > > > > > > Hi, > > > > > > Improved tab completion for "ALTER DEFAULT PRIVILEGE" and "ALTER TABLE": > > > 1) GRANT, REVOKE and FOR USER keyword was not displayed in tab > > > completion of alter default privileges like the below statement: > > > ALTER DEFAULT PRIVILEGES GRANT INSERT ON tables TO PUBLIC; > > > ALTER DEFAULT PRIVILEGES REVOKE INSERT ON tables FROM PUBLIC; > > > ALTER DEFAULT PRIVILEGES FOR USER vignesh revoke INSERT ON tables FROM > > > dba1; > > > > > > 2) USER was not displayed for "ALTER DEFAULT PRIVILEGES IN SCHEMA > > > public FOR " like in below statement: > > > ALTER DEFAULT PRIVILEGES IN SCHEMA public FOR USER dba1 GRANT INSERT > > > ON TABLES TO PUBLIC; > > > > > > 3) "FOR GRANT OPTION" was not display for "ALTER DEFAULT PRIVILEGES > > > REVOKE " like in below statement: > > > alter default privileges revoke grant option for select ON tables FROM > > > dba1; > > > > > > 4) "DATA TYPE" was missing in "ALTER TABLE table-name ALTER COLUMN > > > column-name SET" like in: > > > ALTER TABLE t1 ALTER COLUMN c1 SET DATA TYPE text; > > > > > > Attached patch has the changes for the same. > > > > +COMPLETE_WITH("ROLE", "USER"); > > + /* ALTER DEFAULT PRIVILEGES REVOKE */ > > + else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "REVOKE")) > > +COMPLETE_WITH("SELECT", "INSERT", "UPDATE", "DELETE", "TRUNCATE", > > + "REFERENCES", "TRIGGER", "CREATE", "EXECUTE", "USAGE", > > + "MAINTAIN", "ALL", "GRANT OPTION FOR"); > > > > I could not find "alter default privileges revoke maintain", should > > this be removed? > > This was reverted as part of: > 151c22deee66a3390ca9a1c3675e29de54ae73fc. > Revert MAINTAIN privilege and pg_maintain predefined role. > > This reverts the following commits: 4dbdb82513, c2122aae63, > 5b1a879943, 9e1e9d6560, ff9618e82a, 60684dd834, 4441fc704d, > and b5d6382496. A role with the MAINTAIN privilege may be able to > use search_path tricks to escalate privileges to the table owner. > Unfortunately, it is too late in the v16 development cycle to apply > the proposed fix, i.e., restricting search_path when running > maintenance commands. > > The attached v2 version has the changes for the same. The patch was not applying because of a recent commit in tab completion, PSA new patch set. 
Regards, Vignesh From e3695181cf7d5f31dda520a01def544dc576d6c1 Mon Sep 17 00:00:00 2001 From: Vignesh C Date: Sun, 2 Jul 2023 19:35:46 +0530 Subject: [PATCH v3 1/2] Fix missing tab completion in "ALTER DEFAULT PRIVILEGES" GRANT, REVOKE and FOR USER keyword was not displayed in tab completion of alter default prvileges like the below statement: ALTER DEFAULT PRIVILEGES GRANT INSERT ON tables TO PUBLIC; ALTER DEFAULT PRIVILEGES REVOKE INSERT ON tables FROM PUBLIC; ALTER DEFAULT PRIVILEGES FOR USER vignesh revoke INSERT ON tables FROM vignesh; USER was not displayed for "ALTER DEFAULT PRIVILEGES IN SCHEMA public FOR " like in below statement: ALTER DEFAULT PRIVILEGES IN SCHEMA public FOR USER vignesh GRANT INSERT ON TABLES TO PUBLIC; "FOR GRANT OPTION" was not display for "ALTER DEFAULT PRIVILEGES REVOKE " like in below statement: alter default privileges revoke grant option for select ON tables FROM testdba; --- src/bin/psql/tab-complete.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index ada711d02f..d5b3794fd6 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -2144,10 +2144,15 @@ psql_completion(const char *text, int start, int end) /* ALTER DEFAULT PRIVILEGES */ else if (Matches("ALTER", "DEFAULT", "PRIVILEGES")) - COMPLETE_WITH("FOR ROLE", "IN SCHEMA"); + COMPLETE_WITH("FOR", "GRANT", "IN SCHEMA", "REVOKE"); /* ALTER DEFAULT PRIVILEGES FOR */ else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "FOR")) - COMPLETE_WITH("ROLE"); + COMPLETE_WITH("ROLE", "USER"); + /* ALTER DEFAULT PRIVILEGES REVOKE */ + else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "REVOKE")) + COMPLETE_WITH("SELECT", "INSERT", "UPDATE", "DELETE", "TRUNCATE", + "REFERENCES", "TRIGGER", "CREATE", "EXECUTE", "USAGE", + "ALL", "GRANT OPTION FOR"); /* ALTER DEFAULT PRIVILEGES IN */ else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "IN")) COMPLETE_WITH("SCHEMA"); @@ -2158,11 +2163,11 @@ psql_completion(const char *text, int start, int end) /* ALTER DEFAULT PRIVILEGES IN SCHEMA ... */ else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "IN", "SCHEMA", MatchAny)) - COMPLETE_WITH("GRANT", "REVOKE", "FOR ROLE"); + COMPLETE_WITH("GRANT", "REVOKE", "FOR"); /* ALTER DEFAULT PRIVILEGES IN SCHEMA ... FOR */ else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "IN", "SCHEMA", MatchAny, "FOR")) - COMPLETE_WITH("ROLE"); + COMPLETE_WITH("ROLE", "USER"); /*
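For anyone wanting to sanity-check the v3 behaviour, a psql session with the patch applied should complete roughly like this (illustrative; readline sorts and lays out the candidates itself):

postgres=# ALTER DEFAULT PRIVILEGES <TAB>
FOR        GRANT      IN SCHEMA  REVOKE
postgres=# ALTER DEFAULT PRIVILEGES REVOKE <TAB>
ALL     CREATE  DELETE  EXECUTE  GRANT OPTION FOR  INSERT
REFERENCES  SELECT  TRIGGER  TRUNCATE  UPDATE  USAGE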
Re: Evaluate arguments of correlated SubPlans in the referencing ExprState
On Tue, 24 Oct 2023 at 01:47, Alena Rybakina wrote: > > Hi! > > I looked through your patch and noticed that it was not applied to the > current version of the master. I rebased it and attached a version. I didn't > see any problems and, honestly, no big changes were needed, all regression > tests were passed. > > I think it's better to add a test, but to be honest, I haven't been able to > come up with something yet. The patch does not apply anymore as in CFBot at [1]: === Applying patches on top of PostgreSQL commit ID 7014c9a4bba2d1b67d60687afb5b2091c1d07f73 === === applying patch ./v2-0001-WIP-Evaluate-arguments-of-correlated-SubPlans-in-the.patch patching file src/include/executor/execExpr.h Hunk #1 succeeded at 160 (offset 1 line). Hunk #2 succeeded at 382 (offset 2 lines). Hunk #3 FAILED at 778. 1 out of 3 hunks FAILED -- saving rejects to file src/include/executor/execExpr.h.rej patching file src/include/nodes/execnodes.h Hunk #1 succeeded at 959 (offset 7 lines). Please have a look and post an updated version. [1] - http://cfbot.cputube.org/patch_46_4209.log Regards, Vignesh
Small fix on COPY ON_ERROR document
Hi, I found that the documentation of COPY ON_ERROR said COPY stops operation at the first error when "ON_ERROR is not specified.", but it also stops when ON_ERROR is set to its default value, "stop". I attached a very small patch to fix this just to make the description more accurate. Regards, Yugo Nagata -- Yugo NAGATA diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml index 21a5c4a052..14c8ceab06 100644 --- a/doc/src/sgml/ref/copy.sgml +++ b/doc/src/sgml/ref/copy.sgml @@ -577,8 +577,8 @@ COPY count COPY stops operation at the first error when -ON_ERROR is not specified. This -should not lead to problems in the event of a COPY +ON_ERROR is not specified or stop. +This should not lead to problems in the event of a COPY TO, but the target table will already have received earlier rows in a COPY FROM. These rows will not be visible or accessible, but they still occupy disk space. This might
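A quick way to see that the two spellings behave identically (a sketch; assumes a table t with a single integer column):

COPY t FROM STDIN;                 -- ON_ERROR defaults to stop
COPY t FROM STDIN (ON_ERROR stop); -- explicit default, same behavior
-- both stop at the first malformed row with e.g.:
-- ERROR:  invalid input syntax for type integer: "two"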
Re: Printing backtrace of postgres processes
On Sun, 5 Nov 2023 at 01:49, Bharath Rupireddy wrote: > > On Thu, Jul 20, 2023 at 8:22 PM Daniel Gustafsson wrote: > > > > > On 11 Jan 2023, at 15:44, Bharath Rupireddy > > > wrote: > > > > > > On Wed, Nov 30, 2022 at 11:43 AM Bharath Rupireddy > > > wrote: > > >> > > >> I'm attaching the v22 patch set for further review. > > > > > > Needed a rebase due to 216a784829c2c5f03ab0c43e009126cbb819e9b2. > > > Attaching v23 patch set for further review. > > > > This thread has stalled for well over 6 months with the patch going from CF > > to > > CF. From skimming the thread it seems that a lot of the details have been > > ironed out with most (all?) objections addressed. Is any committer > > interested > > in picking this up? If not we should probably mark it returned with > > feedback. > > Rebase needed due to function oid clash. Picked the new OID with the > help of src/include/catalog/unused_oids. PSA v24 patch set. Rebase needed due to changes in parallel_schedule. PSA v25 patch set. Regards, Vignesh From 32fa1d898b2599b836a2265f746acf230ffb358e Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Sat, 4 Nov 2023 18:06:36 + Subject: [PATCH v25 1/2] Move sending multiplexed-SIGUSR1 signal code to a function Add a new function hosting the common code for sending multiplexed-SIGUSR1 signal to a backend process. This function will also be used as-is by an upcoming commit reducing the code duplication. --- src/backend/storage/ipc/procarray.c | 60 + src/backend/utils/adt/mcxtfuncs.c | 49 ++- src/include/storage/procarray.h | 1 + 3 files changed, 64 insertions(+), 46 deletions(-) diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index ee2d7f8585..f6465529e7 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -3097,6 +3097,66 @@ HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids, int type) return result; } +/* + * SendProcSignalBackendOrAuxproc -- check if the process with given pid is a + * backend or an auxiliary process and send it the SIGUSR1 signal for a given + * reason. + * + * Returns true if sending the signal was successful, false otherwise. + */ +bool +SendProcSignalBackendOrAuxproc(int pid, ProcSignalReason reason) +{ + PGPROC *proc; + BackendId backendId = InvalidBackendId; + + proc = BackendPidGetProc(pid); + + /* + * See if the process with given pid is a backend or an auxiliary process. + * + * If the given process is a backend, use its backend id in + * SendProcSignal() later to speed up the operation. Otherwise, don't do + * that because auxiliary processes (except the startup process) don't + * have a valid backend id. + */ + if (proc != NULL) + backendId = proc->backendId; + else + proc = AuxiliaryPidGetProc(pid); + + /* + * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid + * isn't valid; but by the time we reach kill(), a process for which we + * get a valid proc here might have terminated on its own. There's no way + * to acquire a lock on an arbitrary process to prevent that. But since + * this mechanism is usually used to debug a backend or an auxiliary + * process running and consuming lots of memory or a long running process, + * that it might end on its own first and its memory contexts are not + * logged or backtrace not logged is not a problem. + */ + if (proc == NULL) + { + /* + * This is just a warning so a loop-through-resultset will not abort + * if one backend terminated on its own during the run. 
+ */ + ereport(WARNING, +(errmsg("PID %d is not a PostgreSQL server process", pid))); + return false; + } + + if (SendProcSignal(pid, reason, backendId) < 0) + { + /* Again, just a warning to allow loops */ + ereport(WARNING, +(errmsg("could not send signal to process %d: %m", pid))); + return false; + } + + return true; +} + /* * BackendPidGetProc -- get a backend's PGPROC given its PID * diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c index 4708d73f5f..f7b4f8dac1 100644 --- a/src/backend/utils/adt/mcxtfuncs.c +++ b/src/backend/utils/adt/mcxtfuncs.c @@ -145,51 +145,8 @@ Datum pg_log_backend_memory_contexts(PG_FUNCTION_ARGS) { int pid = PG_GETARG_INT32(0); - PGPROC *proc; - BackendId backendId = InvalidBackendId; + bool result; - proc = BackendPidGetProc(pid); - - /* - * See if the process with given pid is a backend or an auxiliary process. - * - * If the given process is a backend, use its backend id in - * SendProcSignal() later to speed up the operation. Otherwise, don't do - * that because auxiliary processes (except the startup process) don't - * have a valid backend id. - */ - if (proc != NULL) - backendId = proc->backendId; - else - proc = AuxiliaryPidGetProc(pid); - - /* - * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid - * isn't valid; but by the time
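With the helper from 0001 in place, the SQL-callable function added by 0002 presumably reduces to something like the following (a sketch; the function name pg_log_backtrace and the PROCSIG_LOG_BACKTRACE reason are my assumptions here, check the actual patch):

Datum
pg_log_backtrace(PG_FUNCTION_ARGS)
{
    int         pid = PG_GETARG_INT32(0);

    /* proc lookup and WARNING-instead-of-ERROR handling live in the helper */
    PG_RETURN_BOOL(SendProcSignalBackendOrAuxproc(pid, PROCSIG_LOG_BACKTRACE));
}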
Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
On Wed, 20 Dec 2023 at 19:17, Jelte Fennema-Nio wrote: > > On Thu, 14 Dec 2023 at 13:57, Jelte Fennema-Nio wrote: > > I changed all the places that were not adhering to those spellings. > > It seems I forgot a /g on my sed command to do this so it turned out I > missed one that caused the test to fail to compile... Attached is a > fixed version. > > I also updated the patchset to use the EOF detection provided by > 0a5c46a7a488f2f4260a90843bb9de6c584c7f4e instead of introducing a new > way of EOF detection using a -2 return value. CFBot shows that the patch does not apply anymore as in [1]: patching file doc/src/sgml/libpq.sgml ... patching file src/interfaces/libpq/exports.txt Hunk #1 FAILED at 191. 1 out of 1 hunk FAILED -- saving rejects to file src/interfaces/libpq/exports.txt.rej patching file src/interfaces/libpq/fe-connect.c Please post an updated version for the same. [1] - http://cfbot.cputube.org/patch_46_3511.log Regards, Vignesh
Re: [PATCH] Add native windows on arm64 support
On Thu, Jan 25, 2024 at 04:52:30PM -0500, Dave Cramer wrote: > On Thu, 25 Jan 2024 at 16:32, Andrew Dunstan wrote: >> On 2024-01-25 Th 16:17, Dave Cramer wrote: >> Yeah, I think the default Developer Command Prompt for VS2022 is set up >> for x86 builds. AIUI you should start by executing "vcvarsall x64_arm64". > > Yup, now I'm in the same state you are Wait a minute here. Based on [1], x64_arm64 means you can use an x64 host and you'll be able to produce ARM64 builds, but these will not be able to run on the host where they were built. How much of the patch posted upthread is required to produce such builds? Basically everything from it, I guess, so that build dependencies can be satisfied? [1]: https://learn.microsoft.com/en-us/cpp/build/building-on-the-command-line?view=msvc-170 -- Michael
RE: Fix inappropriate comments in function ReplicationSlotAcquire
On Thu, Jan 25, 2024 at 20:33 Amit Kapila wrote: > On Thu, Jan 25, 2024 at 4:01 PM Wei Wang (Fujitsu) > wrote: > > > > > > Yes, agree. I think these two parts have become slightly outdated after the > > commit 1632ea4. So I also tried to fix the first part of the comment. > > Attach the new patch. > > > > How about changing it to something simple like: > diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c > index f2781d0455..84c257a7aa 100644 > --- a/src/backend/replication/slot.c > +++ b/src/backend/replication/slot.c > @@ -465,10 +465,7 @@ retry: > > LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); > > - /* > -* Search for the slot with the specified name if the slot to acquire > is > -* not given. If the slot is not found, we either return -1 or > error out. > -*/ > +/* Check if the slot exists with the given name. */ > s = SearchNamedReplicationSlot(name, false); > if (s == NULL || !s->in_use) > { It looks good to me. So, I updated the patch as suggested. Regards, Wang Wei
Re: Use of backup_label not noted in log
On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote: > I would still advocate for a back patch here. It is frustrating to get logs > from users that just say: > > LOG: invalid checkpoint record > PANIC: could not locate a valid checkpoint record > > It would be very helpful to know what the checkpoint record LSN was in this > case. Yes, I've grumbled over this one in the past when debugging corruption issues. To me, this would just mean appending an "at %X/%X" to the PANIC, but perhaps you have more in mind for these code paths? -- Michael
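For reference, the logging added by the commit under discussion turns a trace like the one above into something along these lines (wording approximate, LSNs illustrative):

LOG:  starting backup recovery with redo LSN 0/2000028, checkpoint LSN 0/2000060, on timeline ID 1
LOG:  invalid checkpoint record
PANIC:  could not locate a valid checkpoint record

and appending the LSN to the PANIC, as suggested, would additionally give "could not locate a valid checkpoint record at 0/2000060".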
Rename setup_cancel_handler in pg_dump
Hi, Attached is a simple patch to rename setup_cancel_handler() in pg_dump/parallel.c. I am proposing it because there is a public function with the same name in fe_utils/cancel.c. I know pg_dump/parallel.c does not include fe_utils/cancel.h, so there is no conflict, but I think it is better to use different names to reduce possible confusion. I guess there were no concerns when setup_cancel_handler in pg_dump/parallel.c was introduced because there was no function with the same name in fe_utils that could be shared between client tools. The public setup_cancel_handler in fe_utils was introduced in a4fd3aa719e, where this function was moved from psql. Regards, Yugo Nagata -- Yugo NAGATA diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c index 188186829c..261b23cb3f 100644 --- a/src/bin/pg_dump/parallel.c +++ b/src/bin/pg_dump/parallel.c @@ -204,7 +204,7 @@ static ParallelSlot *GetMyPSlot(ParallelState *pstate); static void archive_close_connection(int code, void *arg); static void ShutdownWorkersHard(ParallelState *pstate); static void WaitForTerminatingWorkers(ParallelState *pstate); -static void setup_cancel_handler(void); +static void pg_dump_setup_cancel_handler(void); static void set_cancel_pstate(ParallelState *pstate); static void set_cancel_slot_archive(ParallelSlot *slot, ArchiveHandle *AH); static void RunWorker(ArchiveHandle *AH, ParallelSlot *slot); @@ -550,7 +550,7 @@ sigTermHandler(SIGNAL_ARGS) /* * Some platforms allow delivery of new signals to interrupt an active * signal handler. That could muck up our attempt to send PQcancel, so - * disable the signals that setup_cancel_handler enabled. + * disable the signals that pg_dump_setup_cancel_handler enabled. */ pqsignal(SIGINT, SIG_IGN); pqsignal(SIGTERM, SIG_IGN); @@ -605,7 +605,7 @@ sigTermHandler(SIGNAL_ARGS) * Enable cancel interrupt handler, if not already done. */ static void -setup_cancel_handler(void) +pg_dump_setup_cancel_handler(void) { /* * When forking, signal_info.handler_set will propagate into the new @@ -705,7 +705,7 @@ consoleHandler(DWORD dwCtrlType) * Enable cancel interrupt handler, if not already done. */ static void -setup_cancel_handler(void) +pg_dump_setup_cancel_handler(void) { if (!signal_info.handler_set) { @@ -737,7 +737,7 @@ set_archive_cancel_info(ArchiveHandle *AH, PGconn *conn) * important that this happen at least once before we fork off any * threads. */ - setup_cancel_handler(); + pg_dump_setup_cancel_handler(); /* * On Unix, we assume that storing a pointer value is atomic with respect
Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value
On Tue, 2 Jan 2024 08:00:00 +0800 jian he wrote: > On Mon, Nov 6, 2023 at 8:00 AM jian he wrote: > > > > minor doc issues. > > Returns the chunk id of the TOASTed value, or NULL if the value is not > > TOASTed. > > Should it be "chunk_id"? Thank you for your suggestion. As you pointed out, it is called "chunk_id" in the documentation, so I rewrote it and also added a link to the section where the TOAST table structure is explained. > > you may place it after pg_create_logical_replication_slot entry to > > make it look like alphabetical order. I've been thinking about where we should place the function in the doc, and I decided to place it in the table of Database Object Size Functions because I think pg_column_toast_chunk_id would also assist in understanding the result of size functions, similar to pg_column_compression; that is, those functions can explain why a large value could be stored in a column. > > There is no test. maybe we can add following to > > src/test/regress/sql/misc.sql > > create table val(t text); > > INSERT into val(t) SELECT string_agg( > > chr((ascii('B') + round(random() * 25)) :: integer),'') > > FROM generate_series(1,2500); > > select pg_column_toast_chunk_id(t) is not null from val; > > drop table val; Thank you for the test proposal. However, if we add a test, I want to check that the chunk_id returned by the function exists in the TOAST table, and that it returns NULL if the value is not TOASTed. For that purpose, I wrote a test using dynamic SQL since the name of the TOAST table has to be generated from the main table's OID. > Hi > the main C function (pg_column_toast_chunk_id) I didn't change. > I added tests as mentioned above. > tests put it on src/test/regress/sql/misc.sql, i hope that's fine. > I placed pg_column_toast_chunk_id in "Table 9.99. Database Object > Location Functions" (below Table 9.98. Database Object Size > Functions). I could not find any change in your patch from my previous patch. Maybe you attached the wrong file. I attached a patch updated based on your review, including the documentation fixes and a test. What do you think about it? Regards, Yugo Nagata -- Yugo NAGATA >From 97af1b2300ecf07a34923da87a9d84e6aa963956 Mon Sep 17 00:00:00 2001 From: Yugo Nagata Date: Wed, 29 Mar 2023 09:59:25 +0900 Subject: [PATCH v3] Add pg_column_toast_chunk_id function This function returns the chunk_id of an on-disk TOASTed value, or NULL if the value is un-TOASTed or not on disk. This enables users to know which columns are actually TOASTed. This function is also useful to identify a problematic row when an error like "ERROR: unexpected chunk number ... (expected ...) for toast value" occurs. --- doc/src/sgml/func.sgml | 17 ++ src/backend/utils/adt/varlena.c | 41 + src/include/catalog/pg_proc.dat | 3 +++ 3 files changed, 61 insertions(+) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 210c7c0b02..2d82331323 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -28078,6 +28078,23 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset + + + + pg_column_toast_chunk_id + +pg_column_toast_chunk_id ( "any" ) +oid + + +Shows the chunk_id of an on-disk +TOASTed value. Returns NULL +if the value is un-TOASTed or not on-disk. +See for details about +TOAST.
+ + + diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index 543afb66e5..84d36781a4 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -5105,6 +5105,47 @@ pg_column_compression(PG_FUNCTION_ARGS) PG_RETURN_TEXT_P(cstring_to_text(result)); } +/* + * Return the chunk_id of the on-disk TOASTed value. + * Return NULL if the value is unTOASTed or not on disk. + */ +Datum +pg_column_toast_chunk_id(PG_FUNCTION_ARGS) +{ + int typlen; + struct varlena *attr; + struct varatt_external toast_pointer; + + /* On first call, get the input type's typlen, and save at *fn_extra */ + if (fcinfo->flinfo->fn_extra == NULL) + { + /* Lookup the datatype of the supplied argument */ + Oid argtypeid = get_fn_expr_argtype(fcinfo->flinfo, 0); + + typlen = get_typlen(argtypeid); + if (typlen == 0) /* should not happen */ + elog(ERROR, "cache lookup failed for type %u", argtypeid); + + fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt, + sizeof(int)); + *((int *) fcinfo->flinfo->fn_extra) = typlen; + } + else + typlen = *((int *) fcinfo->flinfo->fn_extra); + + if (typlen != -1) + PG_RETURN_NULL(); + + attr = (struct varlena *) DatumGetPointer(PG_GETARG_DATUM(0)); + + if (!VARATT_IS_EXTERNAL_ONDISK(attr)) + PG_RETURN_NULL(); + + VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr); + +
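As a usage sketch for the new function (the chunk_id and table OID below are illustrative; whether the value actually goes out of line depends on its size after compression):

CREATE TABLE t (v text);
INSERT INTO t SELECT string_agg(md5(i::text), '') FROM generate_series(1, 1000) i;
SELECT pg_column_toast_chunk_id(v) FROM t;
 pg_column_toast_chunk_id
--------------------------
                    16402
(1 row)
-- the returned value can then be matched against the chunk_id column of
-- the table's TOAST relation, e.g.:
SELECT DISTINCT chunk_id FROM pg_toast.pg_toast_16399;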
Re: speed up a logical replica setup
On Thu, Jan 25, 2024, at 6:05 AM, Hayato Kuroda (Fujitsu) wrote: > 01. > ``` > /* Options */ > static char *pub_conninfo_str = NULL; > static SimpleStringList database_names = {NULL, NULL}; > static int wait_seconds = DEFAULT_WAIT; > static bool retain = false; > static bool dry_run = false; > ``` > > Just to confirm - is there a policy why we store the specified options? If you > want to store as global ones, username and port should follow (my fault...). > Or, should we have a structure to store them? It is a matter of style I would say. Check other client applications. Some of them also use global variable. There are others that group options into a struct. I would say that since it has a short lifetime, I don't think the current style is harmful. > 04. > ``` > {"dry-run", no_argument, NULL, 'n'}, > ``` > > I'm not sure why the dry_run mode exists. In terms pg_resetwal, it shows the > which value would be changed based on the input. As for the pg_upgrade, it > checks > whether the node can be upgraded for now. I think, we should have the checking > feature, so it should be renamed to --check. Also, the process should exit > earlier > at that time. It is extremely useful because (a) you have a physical replication setup and don't know if it is prepared for logical replication, (b) check GUCs (is max_wal_senders sufficient for this pg_subscriber command? Or is max_replication_slots sufficient to setup the logical replication even though I already have some used replication slots?), (c) connectivity and (d) credentials. > 05. > I felt we should accept some settings from enviroment variables, like > pg_upgrade. > Currently, below items should be acceted. > > - data directory > - username > - port > - timeout Maybe PGDATA. > 06. > ``` > pg_logging_set_level(PG_LOG_WARNING); > ``` > > If the default log level is warning, there are no ways to output debug logs. > (-v option only raises one, so INFO would be output) > I think it should be PG_LOG_INFO. You need to specify multiple -v options. > 07. > Can we combine verifications into two functions, e.g., check_primary() and > check_standby/check_subscriber()? I think v9 does it. > 08. > Not sure, but if we want to record outputs by pg_subscriber, the sub-directory > should be created. The name should contain the timestamp. The log file already contains the timestamp. Why? > 09. > Not sure, but should we check max_slot_wal_keep_size of primary server? It can > avoid to fail starting of logical replicaiton. A broken physical replication *before* running this tool is its responsibility? Hmm. We might add another check that can be noticed during dry run mode. > 10. > ``` > nslots_new = nslots_old + dbarr.ndbs; > > if (nslots_new > max_replication_slots) > { > pg_log_error("max_replication_slots (%d) must be greater than or equal to " > "the number of replication slots required (%d)", max_replication_slots, > nslots_new); > exit(1); > } > ``` > > I think standby server must not have replication slots. Because subsequent > pg_resetwal command discards all the WAL file, so WAL records pointed by them > are removed. Currently pg_resetwal does not raise ERROR at that time. Again, dry run mode might provide a message for it. > 11. > ``` > /* > * Stop the subscriber if it is a standby server. Before executing the > * transformation steps, make sure the subscriber is not running because > * one of the steps is to modify some recovery parameters that require a > * restart. 
> */ > if (stat(pidfile, ) == 0) > ``` > > I kept just in case, but I'm not sure it is still needed. How do you think? > Removing it can reduce an inclusion of pidfile.h. Are you suggesting another way to check if the standby is up and running? > 12. > ``` > pg_ctl_cmd = psprintf("\"%s/pg_ctl\" stop -D \"%s\" -s", > standby.bindir, standby.pgdata); > rc = system(pg_ctl_cmd); > pg_ctl_status(pg_ctl_cmd, rc, 0); > ``` > > > There are two places to stop the instance. Can you divide it into a function? Yes. > 13. > ``` > * A temporary replication slot is not used here to avoid keeping a > * replication connection open (depending when base backup was taken, the > * connection should be open for a few hours). > */ > conn = connect_database(primary.base_conninfo, dbarr.perdb[0].dbname); > if (conn == NULL) > exit(1); > consistent_lsn = create_logical_replication_slot(conn, true, > [0]); > ``` > > I didn't notice the comment, but still not sure the reason. Why we must > reserve > the slot until pg_subscriber finishes? IIUC, the slot would be never used, it > is created only for getting a consistent_lsn. So we do not have to keep. > Also, just before, logical replication slots for each databases are created, > so > WAL records are surely reserved. This comment needs to be updated. It was written at the time I was pursuing base backup support too. It doesn't matter if you remove this transient replication slot earlier because all of the replication slots created to the
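For readers of the thread, the kind of dry-run invocation being discussed looks roughly like this (a sketch; the option spellings follow the current patch and may still change):

pg_subscriber --pgdata data_N2 \
    --publisher-conninfo 'host=primary port=5432 user=postgres' \
    --subscriber-conninfo 'port=5433 user=postgres' \
    --database postgres --dry-run --verbose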
Re: Make COPY format extendable: Extract COPY TO format implementations
On Thu, Jan 25, 2024 at 05:45:43PM +0900, Sutou Kouhei wrote: > In > "Re: Make COPY format extendable: Extract COPY TO format implementations" > on Thu, 25 Jan 2024 12:17:55 +0900, > Michael Paquier wrote: >> +extern CopyToRoutine CopyToRoutineText; >> +extern CopyToRoutine CopyToRoutineCSV; >> +extern CopyToRoutine CopyToRoutineBinary; >> >> All that should IMO remain in copyto.c and copyfrom.c in the initial >> patch doing the refactoring. Why not using a fetch function instead >> that uses a string in input? Then you can call that once after >> parsing the List of options in ProcessCopyOptions(). > > OK. How about the following for the fetch function > signature? > > extern CopyToRoutine *GetBuiltinCopyToRoutine(const char *format); Or CopyToRoutineGet()? I am not wedded to my suggestion, got a bad history with naming things around here. > We may introduce an enum and use it: > > typedef enum CopyBuiltinFormat > { > COPY_BUILTIN_FORMAT_TEXT = 0, > COPY_BUILTIN_FORMAT_CSV, > COPY_BUILTIN_FORMAT_BINARY, > } CopyBuiltinFormat; > > extern CopyToRoutine *GetBuiltinCopyToRoutine(CopyBuiltinFormat format); I am not sure that this is necessary as the option value is a string. > Oh, sorry. I assumed that the comment style was adjusted by > pgindent. No worries, that's just something we get used to. I tend to fix a lot of these things by myself when editing patches. >> +getTypeBinaryOutputInfo(attr->atttypid, _func_oid, ); >> +fmgr_info(out_func_oid, >out_functions[attnum - 1]); >> >> Actually, this split is interesting. It is possible for a custom >> format to plug in a custom set of out functions. Did you make use of >> something custom for your own stuff? > > I didn't. My PoC custom COPY format handler for Apache Arrow > just handles integer and text for now. It doesn't use > cstate->out_functions because cstate->out_functions may not > return a valid binary format value for Apache Arrow. So it > formats each value by itself. I mean, if you use a custom output function, you could tweak things even more with byteas or such.. If a callback is expected to do something, like setting the output function OIDs in the start callback, we'd better document it rather than letting that be implied. >> Actually, could it make sense to >> split the assignment of cstate->out_functions into its own callback? > > Yes. Because we need to use getTypeBinaryOutputInfo() for > "binary" and use getTypeOutputInfo() for "text" and "csv". Okay. After sleeping on it, a split makes sense here, because it also reduces the presence of TupleDesc in the start callback. >> Sure, that's part of the start phase, but at least it would make clear >> that a custom method *has* to assign these OIDs to work. The patch >> implies that as a rule, without a comment that CopyToStart *must* set >> up these OIDs. > > CopyToStart doesn't need to set up them if the handler > doesn't use cstate->out_functions. Noted. >> I think that 0001 and 0005 should be handled first, as pieces >> independent of the rest. Then we could move on with 0002~0004 and >> 0006~0008. > > OK. I'll focus on 0001 and 0005 for now. I'll restart > 0002-0004/0006-0008 after 0001 and 0005 are accepted. Once you get these, I'd be interested in re-doing an evaluation of COPY TO and more tests with COPY FROM while running Postgres on scissors. 
One thing I was thinking of using here is my blackhole_am for COPY FROM: https://github.com/michaelpq/pg_plugins/tree/main/blackhole_am As per its name, it does nothing on INSERT, so you could create a table using it as access method, and stress the COPY FROM execution paths without having to mount Postgres on a tmpfs because the data is sent to the void. Perhaps it does not matter, but that moves the tests to the bottlenecks we want to stress (aka the per-row callback for large data sets). I've switched the patch as waiting on author for now. Thanks for your perseverance here. I understand that's not easy to follow up with patches and reviews (^_^;) -- Michael
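To make the fetch-function idea concrete, a minimal sketch (names follow the mail above; nothing here is settled, and the routines would stay private to copyto.c):

/* Look up a built-in CopyToRoutine by format name; error out on unknown names. */
static CopyToRoutine *
GetBuiltinCopyToRoutine(const char *format)
{
    if (strcmp(format, "text") == 0)
        return &CopyToRoutineText;
    if (strcmp(format, "csv") == 0)
        return &CopyToRoutineCSV;
    if (strcmp(format, "binary") == 0)
        return &CopyToRoutineBinary;

    ereport(ERROR,
            (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
             errmsg("COPY format \"%s\" not recognized", format)));
    return NULL;                /* keep compiler quiet */
}

called once from ProcessCopyOptions() after the option list has been parsed, as suggested upthread.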
Re: speed up a logical replica setup
On Tue, Jan 23, 2024, at 10:29 PM, Euler Taveira wrote: > I'll post a new one soon. I'm attaching another patch that fixes some of the issues pointed out by Hayato, Shlok, and Junwang. * publication doesn't exist. The analysis [1] was done by Hayato but I didn't use the proposed patch. Instead I refactored the code a bit [2] and call it from a new function (setup_publisher) that is called before the promotion. * fix wrong path name in the initial comment [3] * change terminology: logical replica -> physical replica [3] * primary / standby is ready for logical replication? setup_publisher() and setup_subscriber() check if the required GUCs are set accordingly. For the primary, it checks that wal_level = logical, that max_replication_slots has enough free replication slots for the proposed setup, and that max_wal_senders are available. For the standby, it checks max_replication_slots for the replication origin and also the remaining number of background workers to start the subscriber. * retain option: I extracted this one from Hayato's patch [4] * target server must be a standby. It seems we agree that this restriction simplifies the code a bit but can be relaxed in the future (if/when base backup support is added.) * recovery timeout option: I decided to include it but I think the use case is too narrow. It helps in broken setups. However, it can be an issue in some scenarios like a time-delayed replica, large replication lag, slow hardware, or a slow network. I didn't use the proposed patch [5]. Instead, I came up with a simple one that defaults to forever. The proposed one defaults to 60 seconds but I'm afraid that due to one of the scenarios I mentioned in a previous sentence, we would cancel a legitimate case. Maybe we should add a message during dry run saying that due to replication lag, it will take longer to run. * refactor primary_slot_name code. With the new setup_publisher and setup_subscriber functions, I split the function that detects the primary_slot_name use into two pieces just to avoid extra connections to get the job done. * remove fallback_application_name as suggested by Hayato [5] because logical replication already includes one. I'm still thinking about replacing --subscriber-conninfo with separate items (username, port, password?, host = socket dir). Maybe it is overengineering. The user can always prepare the environment to avoid unwanted and/or external connections. I didn't change the name from pg_subscriber to pg_createsubscriber yet but if I don't hear objections about it, I'll do it in the next patch. [1] https://www.postgresql.org/message-id/TY3PR01MB9889C5D55206DDD978627D07F5752%40TY3PR01MB9889.jpnprd01.prod.outlook.com [2] https://www.postgresql.org/message-id/73ab86ca-3fd5-49b3-9c80-73d1525202f1%40app.fastmail.com [3] https://www.postgresql.org/message-id/TY3PR01MB9889678E47B918F4D83A6FD8F57B2%40TY3PR01MB9889.jpnprd01.prod.outlook.com [4] https://www.postgresql.org/message-id/TY3PR01MB9889678E47B918F4D83A6FD8F57B2%40TY3PR01MB9889.jpnprd01.prod.outlook.com [5] https://www.postgresql.org/message-id/TY3PR01MB9889593399165B9A04106741F5662%40TY3PR01MB9889.jpnprd01.prod.outlook.com -- Euler Taveira EDB https://www.enterprisedb.com/ From d9ef01a806c3d8697faa444283f19c2deaa58850 Mon Sep 17 00:00:00 2001 From: Euler Taveira Date: Mon, 5 Jun 2023 14:39:40 -0400 Subject: [PATCH v9] Creates a new logical replica from a standby server A new tool called pg_subscriber can convert a physical replica into a logical replica.
It runs on the target server and should be able to connect to the source server (publisher) and the target server (subscriber). The conversion requires a few steps. Check if the target data directory has the same system identifier than the source data directory. Stop the target server if it is running as a standby server. Create one replication slot per specified database on the source server. One additional replication slot is created at the end to get the consistent LSN (This consistent LSN will be used as (a) a stopping point for the recovery process and (b) a starting point for the subscriptions). Write recovery parameters into the target data directory and start the target server (Wait until the target server is promoted). Create one publication (FOR ALL TABLES) per specified database on the source server. Create one subscription per specified database on the target server (Use replication slot and publication created in a previous step. Don't enable the subscriptions yet). Sets the replication progress to the consistent LSN that was got in a previous step. Enable the subscription for each specified database on the target server. Remove the additional replication slot that was used to get the consistent LSN. Stop the target server. Change the system identifier from the target server. Depending on your workload and database size, creating a logical replica couldn't be an option due to resource constraints (WAL backlog should be available until all table data
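The "write recovery parameters" step would presumably boil down to appending settings like these to the target's configuration, with the consistent LSN obtained earlier as the stopping point (the parameter values are illustrative and the exact GUC set is an assumption based on the step list above):

recovery_target_lsn = '0/3000148'
recovery_target_inclusive = true
recovery_target_action = 'promote'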
Re: Guiding principle for dropping LLVM versions?
On 1/25/24 13:41, Thomas Munro wrote: On Thu, Jan 25, 2024 at 4:44 PM Thomas Munro wrote: ... A few build farm animals will now fail in the configure step as discussed, and need some adjustment (ie disable LLVM or upgrade to LLVM 10+ for the master branch). Owners pinged. I think I fixed up the 4 or 6 under my name... Regards, Mark
Re: [PATCH] Add native windows on arm64 support
On Thu, 25 Jan 2024 at 16:32, Andrew Dunstan wrote: > > On 2024-01-25 Th 16:17, Dave Cramer wrote: > > > > On Thu, 25 Jan 2024 at 16:04, Anthony Roberts > wrote: > >> Hi David, >> >> Unix "file" or "dumpbin /headers" in vcvarsall are your best bets. >> >> Thanks, >> Anthony >> > > > So there is another way, select the file in Windows Explorer and right > click, in the compatibility tab if the "Windows on ARM" is greyed out it is > an arm binary. > > So far mine are not :( > > > Yeah, I think the default Developer Command Prompt for VS2022 is set up > for x86 builds. AIUI you should start by executing "vcvarsall x64_arm64". > Yup, now I'm in the same state you are Dave
Re: Use of backup_label not noted in log
On 1/25/24 17:42, Tom Lane wrote: David Steele writes: Another thing to note here -- knowing the LSN is important but also knowing that backup recovery was attempted (i.e. backup_label exists) is really crucial. Knowing both just saves so much time in back and forth debugging. It appears the tally for back patching is: For: Andres, David, Michael B Not Sure: Robert, Laurenz, Michael P It seems at least nobody is dead set against it. We're talking about 1d35f705e, right? That certainly looks harmless and potentially useful. I'm +1 for back-patching. That's the one. If we were modifying existing messages I would be against it, but new, infrequent (but oh so helpful) messages seem fine. Regards, -David
Re: Use of backup_label not noted in log
David Steele writes: > Another thing to note here -- knowing the LSN is important but also > knowing that backup recovery was attempted (i.e. backup_label exists) is > really crucial. Knowing both just saves so much time in back and forth > debugging. > It appears the tally for back patching is: > For: Andres, David, Michael B > Not Sure: Robert, Laurenz, Michael P > It seems at least nobody is dead set against it. We're talking about 1d35f705e, right? That certainly looks harmless and potentially useful. I'm +1 for back-patching. regards, tom lane
Re: Guiding principle for dropping LLVM versions?
On Thu, Jan 25, 2024 at 4:44 PM Thomas Munro wrote: > ... A few build farm animals will > now fail in the configure step as discussed, and need some adjustment > (ie disable LLVM or upgrade to LLVM 10+ for the master branch). Owners pinged.
Re: Use of backup_label not noted in log
On 1/25/24 09:29, Michael Banck wrote: Hi, On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote: I would still advocate for a back patch here. It is frustrating to get logs from users that just say: LOG: invalid checkpoint record PANIC: could not locate a valid checkpoint record It would be very helpful to know what the checkpoint record LSN was in this case. I agree. Another thing to note here -- knowing the LSN is important but also knowing that backup recovery was attempted (i.e. backup_label exists) is really crucial. Knowing both just saves so much time in back and forth debugging. It appears the tally for back patching is: For: Andres, David, Michael B Not Sure: Robert, Laurenz, Michael P It seems at least nobody is dead set against it. Regards, -David
Re: [PATCH] Add native windows on arm64 support
On 2024-01-25 Th 16:17, Dave Cramer wrote: On Thu, 25 Jan 2024 at 16:04, Anthony Roberts wrote: Hi David, Unix "file" or "dumpbin /headers" in vcvarsall are your best bets. Thanks, Anthony So there is another way, select the file in Windows Explorer and right click, in the compatibility tab if the "Windows on ARM" is greyed out it is an arm binary. So far mine are not :( Yeah, I think the default Developer Command Prompt for VS2022 is set up for x86 builds. AIUI you should start by executing "vcvarsall x64_arm64". cheers andrew -- Andrew Dunstan EDB:https://www.enterprisedb.com
Re: More new SQL/JSON item methods
On 2024-01-25 Th 15:58, Tom Lane wrote: I wrote: There's something else going on, because I'm still getting the assertion failure on my Mac with this fix in place. Annoyingly, it goes away if I compile with -O0, so it's kind of hard to identify what's going wrong. No, belay that: I must've got confused about which version I was testing. It's very unclear to me why the undefined reference causes the preceding Assert to misbehave, but that is clearly what's happening. Compiler bug maybe? My Mac has clang 15.0.0, and the unhappy buildfarm members are also late-model clang. Anyway, I did note that the preceding line res = jperOk; is dead code and might as well get removed while you're at it. OK, pushed those. Thanks. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: [PATCH] Add native windows on arm64 support
On Thu, 25 Jan 2024 at 16:04, Anthony Roberts wrote: > Hi David, > > Unix "file" or "dumpbin /headers" in vcvarsall are your best bets. > > Thanks, > Anthony > So there is another way: select the file in Windows Explorer, right-click, and open the Compatibility tab; if "Windows on ARM" is greyed out, it is an ARM binary. So far mine are not :( Thanks, Dave >
Re: [PATCH] Add native windows on arm64 support
On 2024-01-25 Th 15:56, Dave Cramer wrote: On Thu, 25 Jan 2024 at 14:31, Andrew Dunstan wrote: On 2024-01-25 Th 08:45, Dave Cramer wrote: I tried running my buildfarm using my git repo and my branch, but get the following error Status Line: 492 bad branch parameter Content: bad branch parameter fix_arm Web txn failed with status: 1 You can't use your own branch with the buildfarm, you need to let it set up its own branches. So I guess I have to wait until this patch is merged in ? Not quite. There's a switch that lets you test your own code. If you say --from-source /path/to/repo it will run in whatever state the repo is in. It won't care what branch you are on, and it won't try to update the repo. It won't report the results to the server, but it will build and test everything just like in a regular run. (On the "eat your own dog food" principle I use this mode a lot.) If you use that mode you probably also want to specify --verbose as well. cheers andrew -- Andrew Dunstan EDB:https://www.enterprisedb.com
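Concretely, that would be something like (assuming the usual run_build.pl client script):

./run_build.pl --from-source /path/to/postgres --verbose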
Re: Remove unused fields in ReorderBufferTupleBuf
On Thu, 2024-01-25 at 16:16 +0300, Aleksander Alekseev wrote: > Hi, > > > > Here is the corrected patch. > > > > Thank you for updating the patch! I have some comments: > > Thanks for the review. > > > -tuple = (ReorderBufferTupleBuf *) > > +tuple = (HeapTuple) > > MemoryContextAlloc(rb->tup_context, > > - > > sizeof(ReorderBufferTupleBuf) + > > + HEAPTUPLESIZE + > > MAXIMUM_ALIGNOF + > > alloc_len); > > -tuple->alloc_tuple_size = alloc_len; > > -tuple->tuple.t_data = ReorderBufferTupleBufData(tuple); > > +tuple->t_data = (HeapTupleHeader)((char *)tuple + HEAPTUPLESIZE); > > > > Why do we need to add MAXIMUM_ALIGNOF? since HEAPTUPLESIZE is the > > MAXALIGN'd size, it seems we don't need it. heap_form_tuple() does a > > similar thing but it doesn't add it. > > Indeed. I gave it a try and nothing crashed, so it appears that > MAXIMUM_ALIGNOF is not needed. > > > --- > > xl_parameter_change *xlrec = > > -(xl_parameter_change *) > > XLogRecGetData(buf->record); > > +(xl_parameter_change *) > > XLogRecGetData(buf->record); > > > > Unnecessary change. > > That's pgindent. Fixed. > > > --- > > -/* > > - * Free a ReorderBufferTupleBuf. > > - */ > > -void > > -ReorderBufferReturnTupleBuf(ReorderBuffer *rb, ReorderBufferTupleBuf > > *tuple) > > -{ > > -pfree(tuple); > > -} > > - > > > > Why does ReorderBufferReturnTupleBuf need to be moved from > > reorderbuffer.c to reorderbuffer.h? It seems not related to this > > refactoring patch so I think we should do it in a separate patch if we > > really want it. I'm not sure it's necessary, though. > > OK, fixed. > I walked through v6 and didn't note any issues. I do want to ask, the patch alters ReorderBufferReturnTupleBuf() and drops the unused parameter ReorderBuffer *rb. It seems that ReorderBufferFreeSnap(), ReorderBufferReturnRelids(), ReorderBufferImmediateInvalidation(), and ReorderBufferRestoreCleanup() also pass ReorderBuffer *rb but do not use it. Would it be beneficial to implement a separate patch to remove this parameter from these functions also? Thanks, Reid
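For illustration, the sort of follow-up change being proposed would be (a sketch based on the function names listed above; worth confirming the parameter really is unused in each case):

/* before */
extern void ReorderBufferReturnRelids(ReorderBuffer *rb, Oid *relids);
/* after: drop the dead ReorderBuffer pointer */
extern void ReorderBufferReturnRelids(Oid *relids);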
Re: [PATCH] Add native windows on arm64 support
Hi David, Unix "file" or "dumpbin /headers" in vcvarsall are your best bets. Thanks, Anthony On Thu, 25 Jan 2024, 21:01 Dave Cramer, wrote: > > > On Thu, 25 Jan 2024 at 12:30, Andrew Dunstan wrote: > >> >> On 2024-01-24 We 19:02, Michael Paquier wrote: >> >> On Wed, Jan 24, 2024 at 06:45:21AM -0500, Dave Cramer wrote: >> >> I managed to get it to build the vcvarsall arch needs to be x64. I need to >> add some options, but the patch above needs to be applied to build it. >> >> Nice. If I may ask, what kind of host and/or configuration have you >> used to reach a state where the code can be compiled and run tests >> with meson? If you have found specific steps, it may be a good thing >> to document that on the wiki, say around [1]. >> >> Perhaps you have not included TAP? It may be fine in terms of runtime >> checks and coverage. >> >> [1]: >> https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto#Running_on_Windows >> >> >> >> I now have an ARM64 machine, so I set up a W11 ARM64 VM. I think we >> really want to build with x64_arm64, i.e. to generate native arm64 >> binaries. Setting just x64 will not do that, AIUI. >> >> I tried that with the buidfarm, setting that in the config file's call to >> PGBuild::VSenv::getenv(). >> >> That upset msvc_gendef.pl, so I added this there to keep it happy: >> >> $arch = 'x86_64' if $arch eq 'aarch64'; >> >> After that things went ok until I got this: >> >> [1453/2088] "link" @src/backend/postgres.exe.rsp >> FAILED: src/backend/postgres.exe src/backend/postgres.pdb >> "link" @src/backend/postgres.exe.rsp >>Creating library src\backend\postgres.exe.lib >> storage_lmgr_s_lock.c.obj : error LNK2019: unresolved external symbol >> _mm_pause referenced in function perform_spin_delay >> src\backend\postgres.exe : fatal error LNK1120: 1 unresolved externals >> >> >> I haven't made further progress, but I will return to it in the next day >> or so. >> >> While this will be nice to have, I think it won't really matter until >> there is ARM64 support in released versions of Windows Server. AFAICT they >> still only sell versions for x86_64 >> > > I've tried it with my patch attached previously and x64_arm64 and it works > fine. builds using the buildfarm as well. > > Is there a definitive way to figure out if the binaries are x64_arm64 > > Dave >
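E.g., from a vcvarsall prompt (illustrative output; the machine field is the giveaway, and you would see one of these lines):

dumpbin /headers postgres.exe | findstr /i machine
            8664 machine (x64)      (an x86_64 binary)
            AA64 machine (ARM64)    (a native arm64 binary)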
Re: More new SQL/JSON item methods
On 2024-01-25 Th 15:33, Tom Lane wrote: Andrew Dunstan writes: On 2024-01-25 Th 14:31, Tom Lane wrote: (The reported crashes seem to be happening later during a recursive invocation, seemingly because JsonbType(jb) is returning garbage. So there may be another bug after this one.) I don't think so. AIUI The first call deals with the '$' and the second one deals with the '.string()', which is why we see the error on the second call. There's something else going on, because I'm still getting the assertion failure on my Mac with this fix in place. Annoyingly, it goes away if I compile with -O0, so it's kind of hard to identify what's going wrong. Curiouser and curiouser. On my Mac the error is manifest but the fix you suggested cures it. Built with -O2 -g, clang 15.0.0, Apple Silicon. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: [PATCH] Add native windows on arm64 support
On Thu, 25 Jan 2024 at 12:30, Andrew Dunstan wrote: > > On 2024-01-24 We 19:02, Michael Paquier wrote: > > On Wed, Jan 24, 2024 at 06:45:21AM -0500, Dave Cramer wrote: > > I managed to get it to build the vcvarsall arch needs to be x64. I need to > add some options, but the patch above needs to be applied to build it. > > Nice. If I may ask, what kind of host and/or configuration have you > used to reach a state where the code can be compiled and run tests > with meson? If you have found specific steps, it may be a good thing > to document that on the wiki, say around [1]. > > Perhaps you have not included TAP? It may be fine in terms of runtime > checks and coverage. > > [1]: > https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto#Running_on_Windows > > > > I now have an ARM64 machine, so I set up a W11 ARM64 VM. I think we really > want to build with x64_arm64, i.e. to generate native arm64 binaries. > Setting just x64 will not do that, AIUI. > > I tried that with the buidfarm, setting that in the config file's call to > PGBuild::VSenv::getenv(). > > That upset msvc_gendef.pl, so I added this there to keep it happy: > > $arch = 'x86_64' if $arch eq 'aarch64'; > > After that things went ok until I got this: > > [1453/2088] "link" @src/backend/postgres.exe.rsp > FAILED: src/backend/postgres.exe src/backend/postgres.pdb > "link" @src/backend/postgres.exe.rsp >Creating library src\backend\postgres.exe.lib > storage_lmgr_s_lock.c.obj : error LNK2019: unresolved external symbol > _mm_pause referenced in function perform_spin_delay > src\backend\postgres.exe : fatal error LNK1120: 1 unresolved externals > > > I haven't made further progress, but I will return to it in the next day > or so. > > While this will be nice to have, I think it won't really matter until > there is ARM64 support in released versions of Windows Server. AFAICT they > still only sell versions for x86_64 > I've tried it with my patch attached previously and x64_arm64 and it works fine. builds using the buildfarm as well. Is there a definitive way to figure out if the binaries are x64_arm64 Dave
[Doc] Improvements to ddl.sgml Privileges Section and Glossary
Hey, In a nearby user complaint email [1] some missing information regarding ownership reassignment came to light. I took that and went a bit further to add what I felt was further missing information and context for how the privilege system is designed. I've tried to formalize and label existing concepts a bit and updated the glossary accordingly. The attached is a partial rewrite of the patch on the linked post after those comments were taken into consideration. The new glossary entry for privileges defines various qualifications of the term that are used in the new prose. I've marked up each of the variants and point them all back to the main entry. I didn't try to incorporate those terms, or even really look, anywhere else in the documentation. If the general idea is accepted that kind of work can be done as a follow-up. David J. [1] https://www.postgresql.org/message-id/d294818d12280f6223ddf169ab5454927f5186b6.camel%40cybertec.at From a4d6a599a0b5d6b8f280e3d8489e7f4a4a555383 Mon Sep 17 00:00:00 2001 From: "David G. Johnston" Date: Thu, 25 Jan 2024 13:41:48 -0700 Subject: [PATCH] v1-improvements-to-ddl-priv Section --- doc/src/sgml/ddl.sgml | 109 ++--- doc/src/sgml/glossary.sgml | 40 ++ 2 files changed, 105 insertions(+), 44 deletions(-) diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index fc03a349f0..7c9c9d0dd1 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -1855,12 +1855,33 @@ ALTER TABLE products RENAME TO items; - When an object is created, it is assigned an owner. The - owner is normally the role that executed the creation statement. - For most kinds of objects, the initial state is that only the owner - (or a superuser) can do anything with the object. To allow - other roles to use it, privileges must be - granted. + The permissions needed to interact with an object are split between + privileges and + owner rights. + The owner has the right to modify the object, including dropping it, as well as + the right to grant and revoke privileges on the object to any role in the system. + Aside from a brief description, at the end, of what happens when you reassign the + owner of an object to some other role, this section describes privileges. + + + + For most databases there are many roles and many objects, and thus a large amount + of privileges that need to be defined. Two features exist to make management of + privileges easier: role membership (i.e., group roles, see ) + including the PUBLIC pseudo-role (which is a group role consisting + of all roles in the system) is one, while + default privileges + (see ) is the other. + + + + The fundamental design of the privilege system is to disallow by default. A role + must, directly or indirectly (via group role inheritance), hold the correct privilege + on the object to peform the corresponding action. + Furthermore, inheritance is strictly additive, there is no mechanism to block an + indirect privilege. + Revoking a privilege only removes + direct privileges. @@ -1878,21 +1899,14 @@ ALTER TABLE products RENAME TO items; - The right to modify or destroy an object is inherent in being the - object's owner, and cannot be granted or revoked in itself. - (However, like all privileges, that right can be inherited by - members of the owning role; see .) 
- - - - An object can be assigned to a new owner with an ALTER - command of the appropriate kind for the object, for example - -ALTER TABLE table_name OWNER TO new_owner; - - Superusers can always do this; ordinary roles can only do it if they are - both the current owner of the object (or inherit the privileges of the - owning role) and able to SET ROLE to the new owning role. + Upon object creation, two sets of + implicit privileges + are established. + The owner is granted all applicable privileges on the object. All other roles, + including the PUBLIC pseudo-role, get their default privileges. + From this point onward only + explicit priviliege + modification is possible, and is described next. @@ -1904,15 +1918,9 @@ ALTER TABLE table_name OWNER TO new_owne GRANT UPDATE ON accounts TO joe; Writing ALL in place of a specific privilege grants all - privileges that are relevant for the object type. - - - - The special role name PUBLIC can - be used to grant a privilege to every role on the system. Also, - group roles can be set up to help manage privileges when - there are many users of a database for details see - . + privileges that are relevant for the object type. The pseudo-role + PUBLIC is a valid role for this purpose. All roles in the system + will immediately inherit the indirect privilege(s) named in the command. @@ -1936,9 +1944,11 @@ REVOKE ALL ON accounts FROM PUBLIC; An object's owner can choose to revoke
Re: More new SQL/JSON item methods
I wrote: > There's something else going on, because I'm still getting the > assertion failure on my Mac with this fix in place. Annoyingly, > it goes away if I compile with -O0, so it's kind of hard to > identify what's going wrong. No, belay that: I must've got confused about which version I was testing. It's very unclear to me why the undefined reference causes the preceding Assert to misbehave, but that is clearly what's happening. Compiler bug maybe? My Mac has clang 15.0.0, and the unhappy buildfarm members are also late-model clang. Anyway, I did note that the preceding line res = jperOk; is dead code and might as well get removed while you're at it. regards, tom lane
Re: [PATCH] Add native windows on arm64 support
On Thu, 25 Jan 2024 at 14:31, Andrew Dunstan wrote: > > On 2024-01-25 Th 08:45, Dave Cramer wrote: > > > > > > I tried running my buildfarm using my git repo and my branch, but get the > following error > Status Line: 492 bad branch parameter > Content: > bad branch parameter fix_arm > > Web txn failed with status: 1 > > > > You can't use your own branch with the buildfarm, you need to let it set > up its own branches. > So I guess I have to wait until this patch is merged in? Dave
Re: proposal: psql: show current user in prompt
Hi On Thu, 11 Jan 2024 at 12:12, Jelte Fennema-Nio wrote: > On Tue, 12 Sept 2023 at 09:46, Peter Eisentraut > wrote: > > ISTM that for a purpose like pgbouncer, it would be simpler to add a new > > GUC "report these variables" and send that in the startup message? That > > might not help with the psql use case, but it would be much simpler. > > FYI I implemented it that way yesterday on this other thread (patch > 0010 of that patchset): > > https://www.postgresql.org/message-id/flat/CAGECzQScQ3N-Ykv2j4NDyDtrPPc3FpRoa%3DLZ-2Uj2ocA4zr%3D4Q%40mail.gmail.com#cd9e8407820d492e8f677ee6a67c21ce I read your patch, and I see some advantages and some disadvantages. 1. It doesn't need a new protocol API just for this feature, which is nice. 2. Using one GUC for all reported GUCs doesn't look very readable. Maybe it should be used just for customized reporting, not for the defaults. 3. Another issue with your proposal is that enabling and disabling the reporting of a specific GUC is less friendly. Maintaining a list needs more work than just enabling or disabling one specific GUC. I think this is the main disadvantage of your proposal. In my proposal I don't need to know the state of any GUC; I just send PQlinkParameterStatus or PQunlinkParameterStatus. With your proposal, I need to read _pq_.report_parameters, parse it, modify it, and send it back. This doesn't look too practical. Personally I would prefer a more generic API than my PQlinkParameterStatus and PQunlinkParameterStatus. You talked about it with Robert, if I remember correctly. It would be nice if I could write just /* a similar principle is used inside ncurses */ set_report_guc_message_no = PQgetMessageNo("set_report_guc"); /* the result can be processed by the server and by all proxies on the line */ if (set_report_guc_message_no == -1) fprintf(stderr, "feature is not supported"); result = PQsendMessage(set_report_guc_message_no, "user"); if (result == -1) fprintf(stderr, "some error ..."); With some API like this it could be easier to make small protocol enhancements. Maybe this is overengineering. Protocol enhancements are not too common, and with PQsendTypedCommand an enhancement is less work too. Regards Pavel
Re: More new SQL/JSON item methods
Andrew Dunstan writes: > On 2024-01-25 Th 14:31, Tom Lane wrote: >> (The reported crashes seem to be happening later during a >> recursive invocation, seemingly because JsonbType(jb) is >> returning garbage. So there may be another bug after this one.) > I don't think so. AIUI The first call deals with the '$' and the second > one deals with the '.string()', which is why we see the error on the > second call. There's something else going on, because I'm still getting the assertion failure on my Mac with this fix in place. Annoyingly, it goes away if I compile with -O0, so it's kind of hard to identify what's going wrong. regards, tom lane
Re: More new SQL/JSON item methods
On 2024-01-25 Th 14:31, Tom Lane wrote:

Andrew Dunstan writes: Thanks, I have pushed this.

The buildfarm is pretty widely unhappy, mostly failing on

select jsonb_path_query('1.23', '$.string()');

On a guess, I tried running that under valgrind, and behold it said

==00:00:00:05.637 435530== Conditional jump or move depends on uninitialised value(s)
==00:00:00:05.637 435530==    at 0x8FD131: executeItemOptUnwrapTarget (jsonpath_exec.c:1547)
==00:00:00:05.637 435530==    by 0x8FED03: executeItem (jsonpath_exec.c:626)
==00:00:00:05.637 435530==    by 0x8FED03: executeNextItem (jsonpath_exec.c:1604)
==00:00:00:05.637 435530==    by 0x8FCA58: executeItemOptUnwrapTarget (jsonpath_exec.c:956)
==00:00:00:05.637 435530==    by 0x8FFDE4: executeItem (jsonpath_exec.c:626)
==00:00:00:05.637 435530==    by 0x8FFDE4: executeJsonPath.constprop.30 (jsonpath_exec.c:612)
==00:00:00:05.637 435530==    by 0x8FFF8C: jsonb_path_query_internal (jsonpath_exec.c:438)

It's fairly obviously right about that:

    JsonbValue  jbv;
...
    jb = &jbv;
    Assert(tmp != NULL);    /* We must have set tmp above */
    jb->val.string.val = (jb->type == jbvString) ? tmp : pstrdup(tmp);
                          ^

Presumably, this is a mistaken attempt to test the type of the thing previously pointed to by "jb". On the whole, what I'd be inclined to do here is get rid of this test altogether and demand that every path through the preceding "switch" deliver a value that doesn't need pstrdup(). The only path that doesn't do that already is

    case jbvBool:
        tmp = (jb->val.boolean) ? "true" : "false";
        break;

and TBH I'm not sure that we really need a pstrdup there either. The constants are immutable enough. Is something likely to try to pfree the pointer later? I tried

@@ -1544,7 +1544,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
     jb = &jbv;
     Assert(tmp != NULL);    /* We must have set tmp above */
-    jb->val.string.val = (jb->type == jbvString) ? tmp : pstrdup(tmp);
+    jb->val.string.val = tmp;
     jb->val.string.len = strlen(jb->val.string.val);
     jb->type = jbvString;

and that quieted valgrind for this particular query and still passes regression.

Your fix looks sane. I also don't see why we need the pstrdup.

(The reported crashes seem to be happening later during a recursive invocation, seemingly because JsonbType(jb) is returning garbage. So there may be another bug after this one.)

I don't think so. AIUI the first call deals with the '$' and the second one deals with the '.string()', which is why we see the error on the second call.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: Relation bulk write facility
On 10/01/2024 18:17, Robert Haas wrote: I think we should try to pick prefixes that are one or more words rather than using word fragments. bulkw is an awkward prefix even for people whose first language is English, and probably more awkward for others. Renamed the 'bulkw' variables to 'bulkstate', and the functions to have an smgr_bulk_* prefix. I was tempted to use just bulk_* as the prefix, but I'm afraid e.g. bulk_write() is too generic. On 22/01/2024 07:50, Peter Smith wrote: Hi, This patch has a CF status of "Needs Review" [1], but it seems there was a CFbot test failure last time it was run [2]. Please have a look and post an updated version if necessary. Fixed the headerscheck failure by adding appropriate #includes. -- Heikki Linnakangas Neon (https://neon.tech) From c2e8cff9326fb874b2e1643f5c3c8a4952eaa3ac Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Thu, 25 Jan 2024 21:40:46 +0200 Subject: [PATCH v5 1/1] Introduce a new bulk loading facility. The new facility makes it easier to optimize bulk loading, as the logic for buffering, WAL-logging, and syncing the relation only needs to be implemented once. It's also less error-prone: We have had a number of bugs in how a relation is fsync'd - or not - at the end of a bulk loading operation. By centralizing that logic to one place, we only need to write it correctly once. The new facility is faster for small relations: Instead of calling smgrimmedsync(), we register the fsync to happen at the next checkpoint, which avoids the fsync latency. That can make a big difference if you are e.g. restoring a schema-only dump with lots of relations. It is also slightly more efficient with large relations, as the WAL logging is performed multiple pages at a time. That avoids some WAL header overhead. The sorted GiST index build did that already; this moves the buffering to the new facility. The changes to the pageinspect GiST test need an explanation: Before this patch, the sorted GiST index build set the LSN on every page to the special GistBuildLSN value, not the LSN of the WAL record, even though they were WAL-logged. There was no particular need for it, it just happened naturally when we wrote out the pages before WAL-logging them. Now we WAL-log the pages first, like in B-tree build, so the pages are stamped with the record's real LSN. When the build is not WAL-logged, we still use GistBuildLSN. To make the test output predictable, use an unlogged index. 
Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/30e8f366-58b3-b239-c521-422122dd5150%40iki.fi --- contrib/pageinspect/expected/gist.out | 14 +- contrib/pageinspect/sql/gist.sql | 16 +- src/backend/access/gist/gistbuild.c | 122 +++ src/backend/access/heap/rewriteheap.c | 72 ++ src/backend/access/nbtree/nbtree.c| 33 +-- src/backend/access/nbtree/nbtsort.c | 135 src/backend/access/spgist/spginsert.c | 49 ++--- src/backend/catalog/storage.c | 46 ++-- src/backend/storage/smgr/Makefile | 1 + src/backend/storage/smgr/bulk_write.c | 303 ++ src/backend/storage/smgr/md.c | 45 +++- src/backend/storage/smgr/meson.build | 1 + src/backend/storage/smgr/smgr.c | 31 +++ src/include/storage/bulk_write.h | 40 src/include/storage/md.h | 1 + src/include/storage/smgr.h| 1 + src/tools/pgindent/typedefs.list | 3 + 17 files changed, 558 insertions(+), 355 deletions(-) create mode 100644 src/backend/storage/smgr/bulk_write.c create mode 100644 src/include/storage/bulk_write.h diff --git a/contrib/pageinspect/expected/gist.out b/contrib/pageinspect/expected/gist.out index d1adbab8ae2..2b1d54a6279 100644 --- a/contrib/pageinspect/expected/gist.out +++ b/contrib/pageinspect/expected/gist.out @@ -1,13 +1,6 @@ --- The gist_page_opaque_info() function prints the page's LSN. Normally, --- that's constant 1 (GistBuildLSN) on every page of a freshly built GiST --- index. But with wal_level=minimal, the whole relation is dumped to WAL at --- the end of the transaction if it's smaller than wal_skip_threshold, which --- updates the LSNs. Wrap the tests on gist_page_opaque_info() in the --- same transaction with the CREATE INDEX so that we see the LSNs before --- they are possibly overwritten at end of transaction. -BEGIN; --- Create a test table and GiST index. -CREATE TABLE test_gist AS SELECT point(i,i) p, i::text t FROM +-- The gist_page_opaque_info() function prints the page's LSN. +-- Use an unlogged index, so that the LSN is predictable. +CREATE UNLOGGED TABLE test_gist AS SELECT point(i,i) p, i::text t FROM generate_series(1,1000) i; CREATE INDEX test_gist_idx ON test_gist USING gist (p); -- Page 0 is the root, the rest are leaf pages @@ -29,7 +22,6 @@ SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 2)); 0/1 | 0/0 | 1 | {leaf} (1 row) -COMMIT; SELECT * FROM
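For a sense of the API shape after the renaming, here is a minimal usage sketch; the smgr_bulk_* names come from the message above, but the exact signatures (argument order, the page_std flag, MAIN_FORKNUM) are assumptions for illustration rather than a copy of the patch:

/*
 * Sketch of a bulk-load caller: buffering, WAL-logging, and registering
 * the fsync for the next checkpoint all happen inside the facility.
 */
BulkWriteState *bulkstate = smgr_bulk_start_rel(rel, MAIN_FORKNUM);

for (BlockNumber blkno = 0; blkno < nblocks; blkno++)
{
    BulkWriteBuffer buf = smgr_bulk_get_buf(bulkstate);

    /* ... fill the page behind buf with tuples or index entries ... */

    smgr_bulk_write(bulkstate, blkno, buf, true);   /* true: standard page layout */
}
smgr_bulk_finish(bulkstate);    /* WAL-logs pending pages; no smgrimmedsync() */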
Re: More new SQL/JSON item methods
On 2024-01-25 Th 14:31, Tom Lane wrote:

Andrew Dunstan writes: Thanks, I have pushed this.

The buildfarm is pretty widely unhappy, mostly failing on

select jsonb_path_query('1.23', '$.string()');

On a guess, I tried running that under valgrind, and behold it said

==00:00:00:05.637 435530== Conditional jump or move depends on uninitialised value(s)
==00:00:00:05.637 435530==    at 0x8FD131: executeItemOptUnwrapTarget (jsonpath_exec.c:1547)
==00:00:00:05.637 435530==    by 0x8FED03: executeItem (jsonpath_exec.c:626)
==00:00:00:05.637 435530==    by 0x8FED03: executeNextItem (jsonpath_exec.c:1604)
==00:00:00:05.637 435530==    by 0x8FCA58: executeItemOptUnwrapTarget (jsonpath_exec.c:956)
==00:00:00:05.637 435530==    by 0x8FFDE4: executeItem (jsonpath_exec.c:626)
==00:00:00:05.637 435530==    by 0x8FFDE4: executeJsonPath.constprop.30 (jsonpath_exec.c:612)
==00:00:00:05.637 435530==    by 0x8FFF8C: jsonb_path_query_internal (jsonpath_exec.c:438)

It's fairly obviously right about that:

    JsonbValue  jbv;
...
    jb = &jbv;
    Assert(tmp != NULL);    /* We must have set tmp above */
    jb->val.string.val = (jb->type == jbvString) ? tmp : pstrdup(tmp);
                          ^

Presumably, this is a mistaken attempt to test the type of the thing previously pointed to by "jb". On the whole, what I'd be inclined to do here is get rid of this test altogether and demand that every path through the preceding "switch" deliver a value that doesn't need pstrdup(). The only path that doesn't do that already is

    case jbvBool:
        tmp = (jb->val.boolean) ? "true" : "false";
        break;

and TBH I'm not sure that we really need a pstrdup there either. The constants are immutable enough. Is something likely to try to pfree the pointer later? I tried

@@ -1544,7 +1544,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
     jb = &jbv;
     Assert(tmp != NULL);    /* We must have set tmp above */
-    jb->val.string.val = (jb->type == jbvString) ? tmp : pstrdup(tmp);
+    jb->val.string.val = tmp;
     jb->val.string.len = strlen(jb->val.string.val);
     jb->type = jbvString;

and that quieted valgrind for this particular query and still passes regression.

(The reported crashes seem to be happening later during a recursive invocation, seemingly because JsonbType(jb) is returning garbage. So there may be another bug after this one.)

Argh! Will look, thanks.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: [PATCH] Add native windows on arm64 support
On 2024-01-25 Th 08:45, Dave Cramer wrote: I tried running my buildfarm using my git repo and my branch, but get the following error Status Line: 492 bad branch parameter Content: bad branch parameter fix_arm Web txn failed with status: 1 You can't use your own branch with the buildfarm, you need to let it set up its own branches. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: More new SQL/JSON item methods
Andrew Dunstan writes: > Thanks, I have pushed this.

The buildfarm is pretty widely unhappy, mostly failing on

select jsonb_path_query('1.23', '$.string()');

On a guess, I tried running that under valgrind, and behold it said

==00:00:00:05.637 435530== Conditional jump or move depends on uninitialised value(s)
==00:00:00:05.637 435530==    at 0x8FD131: executeItemOptUnwrapTarget (jsonpath_exec.c:1547)
==00:00:00:05.637 435530==    by 0x8FED03: executeItem (jsonpath_exec.c:626)
==00:00:00:05.637 435530==    by 0x8FED03: executeNextItem (jsonpath_exec.c:1604)
==00:00:00:05.637 435530==    by 0x8FCA58: executeItemOptUnwrapTarget (jsonpath_exec.c:956)
==00:00:00:05.637 435530==    by 0x8FFDE4: executeItem (jsonpath_exec.c:626)
==00:00:00:05.637 435530==    by 0x8FFDE4: executeJsonPath.constprop.30 (jsonpath_exec.c:612)
==00:00:00:05.637 435530==    by 0x8FFF8C: jsonb_path_query_internal (jsonpath_exec.c:438)

It's fairly obviously right about that:

    JsonbValue  jbv;
...
    jb = &jbv;
    Assert(tmp != NULL);    /* We must have set tmp above */
    jb->val.string.val = (jb->type == jbvString) ? tmp : pstrdup(tmp);
                          ^

Presumably, this is a mistaken attempt to test the type of the thing previously pointed to by "jb". On the whole, what I'd be inclined to do here is get rid of this test altogether and demand that every path through the preceding "switch" deliver a value that doesn't need pstrdup(). The only path that doesn't do that already is

    case jbvBool:
        tmp = (jb->val.boolean) ? "true" : "false";
        break;

and TBH I'm not sure that we really need a pstrdup there either. The constants are immutable enough. Is something likely to try to pfree the pointer later? I tried

@@ -1544,7 +1544,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
     jb = &jbv;
     Assert(tmp != NULL);    /* We must have set tmp above */
-    jb->val.string.val = (jb->type == jbvString) ? tmp : pstrdup(tmp);
+    jb->val.string.val = tmp;
     jb->val.string.len = strlen(jb->val.string.val);
     jb->type = jbvString;

and that quieted valgrind for this particular query and still passes regression.

(The reported crashes seem to be happening later during a recursive invocation, seemingly because JsonbType(jb) is returning garbage. So there may be another bug after this one.)

regards, tom lane
Re: [PATCH] Add native windows on arm64 support
On Thu, 25 Jan 2024 at 12:30, Andrew Dunstan wrote: > > On 2024-01-24 We 19:02, Michael Paquier wrote: > > On Wed, Jan 24, 2024 at 06:45:21AM -0500, Dave Cramer wrote: > > I managed to get it to build the vcvarsall arch needs to be x64. I need to > add some options, but the patch above needs to be applied to build it. > > Nice. If I may ask, what kind of host and/or configuration have you > used to reach a state where the code can be compiled and run tests > with meson? If you have found specific steps, it may be a good thing > to document that on the wiki, say around [1]. > > Perhaps you have not included TAP? It may be fine in terms of runtime > checks and coverage. > > [1]: > https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto#Running_on_Windows > > > > I now have an ARM64 machine, so I set up a W11 ARM64 VM. I think we really > want to build with x64_arm64, i.e. to generate native arm64 binaries. > Setting just x64 will not do that, AIUI. > > I tried that with the buidfarm, setting that in the config file's call to > PGBuild::VSenv::getenv(). > > That upset msvc_gendef.pl, so I added this there to keep it happy: > > $arch = 'x86_64' if $arch eq 'aarch64'; > > After that things went ok until I got this: > > [1453/2088] "link" @src/backend/postgres.exe.rsp > FAILED: src/backend/postgres.exe src/backend/postgres.pdb > "link" @src/backend/postgres.exe.rsp >Creating library src\backend\postgres.exe.lib > storage_lmgr_s_lock.c.obj : error LNK2019: unresolved external symbol > _mm_pause referenced in function perform_spin_delay > src\backend\postgres.exe : fatal error LNK1120: 1 unresolved externals > > > I haven't made further progress, but I will return to it in the next day > or so. > > While this will be nice to have, I think it won't really matter until > there is ARM64 support in released versions of Windows Server. AFAICT they > still only sell versions for x86_64 > I need it to build clients. The clients need arm64 libraries to link against Dave
Re: A performance issue with Memoize
David Rowley writes: > I think fixing it your way makes sense. I don't really see any reason > why we should have two. However... > Another way it *could* be fixed would be to get rid of pull_paramids() > and change create_memoize_plan() to set keyparamids to all the param > IDs that match are equal() to each param_exprs. That way nodeMemoize > wouldn't purge the cache as we'd know the changing param is accounted > for in the cache. For the record, I don't think that's better, but it > scares me a bit less as I don't know what other repercussions there > are of applying your patch to get rid of the duplicate > NestLoopParam.paramval. > I'd feel better about doing it your way if Tom could comment on if > there was a reason he put the function calls that way around in > 5ebaaa494. I'm fairly sure I thought it wouldn't matter because of the Param de-duplication done in paramassign.c. However, Richard's example shows that's not so, because process_subquery_nestloop_params is picky about the param ID assigned to a particular Var while replace_nestloop_params is not. So flipping the order makes sense. I'd change the comment though, maybe to /* * Replace any outer-relation variables with nestloop params. * * We must provide nestloop params for both lateral references of * the subquery and outer vars in the scan_clauses. It's better * to assign the former first, because that code path requires * specific param IDs, while replace_nestloop_params can adapt * to the IDs assigned by process_subquery_nestloop_params. * This avoids possibly duplicating nestloop params when the * same Var is needed for both reasons. */ However ... it seems like we're not out of the woods yet. Why is Richard's proposed test case still showing + -> Memoize (actual rows=5000 loops=N) + Cache Key: t1.two, t1.two Seems like there is missing de-duplication logic, or something. > I also feel like we might be getting a bit close to the minor version > releases to be adjusting this stuff in back branches. Yeah, I'm not sure I would change this in the back branches. regards, tom lane
Re: [PATCH] Add native windows on arm64 support
On 2024-01-24 We 19:02, Michael Paquier wrote: On Wed, Jan 24, 2024 at 06:45:21AM -0500, Dave Cramer wrote: I managed to get it to build the vcvarsall arch needs to be x64. I need to add some options, but the patch above needs to be applied to build it. Nice. If I may ask, what kind of host and/or configuration have you used to reach a state where the code can be compiled and run tests with meson? If you have found specific steps, it may be a good thing to document that on the wiki, say around [1]. Perhaps you have not included TAP? It may be fine in terms of runtime checks and coverage. [1]: https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto#Running_on_Windows

I now have an ARM64 machine, so I set up a W11 ARM64 VM. I think we really want to build with x64_arm64, i.e. to generate native arm64 binaries. Setting just x64 will not do that, AIUI.

I tried that with the buildfarm, setting that in the config file's call to PGBuild::VSenv::getenv().

That upset msvc_gendef.pl, so I added this there to keep it happy:

$arch = 'x86_64' if $arch eq 'aarch64';

After that things went ok until I got this:

[1453/2088] "link" @src/backend/postgres.exe.rsp
FAILED: src/backend/postgres.exe src/backend/postgres.pdb
"link" @src/backend/postgres.exe.rsp
   Creating library src\backend\postgres.exe.lib
storage_lmgr_s_lock.c.obj : error LNK2019: unresolved external symbol _mm_pause referenced in function perform_spin_delay
src\backend\postgres.exe : fatal error LNK1120: 1 unresolved externals

I haven't made further progress, but I will return to it in the next day or so.

While this will be nice to have, I think it won't really matter until there is ARM64 support in released versions of Windows Server. AFAICT they still only sell versions for x86_64.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
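The unresolved symbol is expected: _mm_pause() is an x86/x64-only intrinsic under MSVC, so perform_spin_delay() cannot link on a native arm64 target. A minimal sketch of the kind of conditional spin_delay() that s_lock.h would need, assuming MSVC's documented __isb() arm64 intrinsic (a sketch, not the eventual fix):

#include <intrin.h>

/*
 * Sketch: pick a CPU "pause" hint per target architecture.  _mm_pause()
 * exists only for x86/x64 under MSVC; on arm64 an instruction
 * synchronization barrier is a commonly used stand-in.
 */
static __forceinline void
spin_delay(void)
{
#if defined(_M_AMD64) || defined(_M_IX86)
    _mm_pause();
#elif defined(_M_ARM64)
    __isb(_ARM64_BARRIER_SY);
#endif
}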
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
On Thu, Jan 25, 2024 at 11:22 AM Alvaro Herrera wrote: > Still with these auto-tuning GUCs, I noticed that the auto-tuning code > would continue to grow the buffer sizes with shared_buffers to > arbitrarily large values. I added an arbitrary maximum of 1024 (8 MB), > which is much higher than the current value of 128; but if you have > (say) 30 GB of shared_buffers (not uncommon these days), do you really > need 30MB of pg_clog cache? It seems mostly unnecessary ... and you can > still set it manually that way if you need it. So, largely I just > rewrote those small functions completely. Yeah, I think that if we're going to scale with shared_buffers, it should be capped. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Emit fewer vacuum records by reaping removable tuples during pruning
On Thu, Jan 25, 2024 at 11:19 AM Melanie Plageman wrote: > Cool. I might add "successfully" or "fully" to "Either way, the page > hasn't been processed yet" I'm OK with that. > > > I think it would be nice to clarify this comment. I think the "when > > > there is little free space anyway" is referring to the case in > > > lazy_vacuum() where we set do_index_vacuuming to false because "there > > > are almost zero TIDs". I initially thought it was saying that in the > > > failsafe vacuum case the pages whose free space we wouldn't record in > > > the FSM have little free space anyway -- which I didn't get. But then > > > I looked at where we set do_index_vacuuming to false. > > > > Oh... wow. That's kind of confusing; somehow I was thinking we were > > talking about free space on the disk, rather than newly free space in > > pages that could be added to the FSM. > > Perhaps I misunderstood. I interpreted it to refer to the bypass optimization. I think you're probably correct. I just didn't realize what was meant. -- Robert Haas EDB: http://www.enterprisedb.com
Re: A performance issue with Memoize
David Rowley writes: > I'd feel better about doing it your way if Tom could comment on if > there was a reason he put the function calls that way around in > 5ebaaa494. Apologies for not having noticed this thread before. I'm taking a look at it now. However, while sniffing around this I found what seems like an oversight in paramassign.c's assign_param_for_var(): it says it should compare all the same fields as _equalVar except for varlevelsup, but it's failing to compare varnullingrels. Is that a bug? It's conceivable that it's not possible to get here with varnullingrels different and all else the same, but I don't feel good about that proposition. I tried adding @@ -91,7 +91,10 @@ assign_param_for_var(PlannerInfo *root, Var *var) pvar->vartype == var->vartype && pvar->vartypmod == var->vartypmod && pvar->varcollid == var->varcollid) +{ +Assert(bms_equal(pvar->varnullingrels, var->varnullingrels)); return pitem->paramId; +} } } This triggers no failures in the regression tests, but we know how little that proves. Anyway, that's just a side observation unrelated to the problem at hand. More later. regards, tom lane
Re: UUID v7
Aleksander, In this case the documentation must state that the functions uuid_extract_time() and uuidv7(T) are against the RFC requirements, and that developers may use these functions with caution at their own risk, and these functions are not recommended for production environments. The function uuidv7(T) is not better than uuid_extract_time(). Careless developers may well pass any business date into this function: document date, registration date, payment date, reporting date, start date of the current month, data download date, and even a constant. This would be a profanation of UUIDv7 with very negative consequences.

Sergey Prokhorenko
sergeyprokhore...@yahoo.com.au

On Thursday, 25 January 2024 at 03:06:50 pm GMT+3, Aleksander Alekseev wrote: Hi, > Postgres always was a bit hackerish, allowing slightly more then is safe. > I.e. you can define immutable function that is not really immutable, turn off > autovacuum or fsync. Why bother with safety guards here? > My opinion is that we should have this function to extract timestamp. Even if > it can return strange values for imprecise RFC implementation. Completely agree. Users that don't like or don't need it can pretend there are no uuid_extract_time() and uuidv7(T) in Postgres. If we don't provide them however, users that need them will end up writing their own probably buggy and not compatible implementations. That would be much worse. > So +1 for erroring when you provide a timestamp outside of that range > (either too far in the past or too far in the future). > > OK, it seems like we have some consensus on ERRORing.. > > Do we have any other open items? Does v13 address all open items? Maybe let's compose better error message? > > +1 for erroring when ts is outside range. > > v13 looks good for me. I think we have reached a optimal compromise. Andrey, many thanks for the updated patch. LGTM, cfbot is happy and I don't think we have any open items left. So changing CF entry status back to RfC. -- Best regards, Aleksander Alekseev
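For reference, the timestamp these functions deal in is the draft RFC's unix_ts_ms field: the first 48 bits of a UUIDv7 hold big-endian Unix-epoch milliseconds. A sketch of the extraction (an illustrative helper, not the patch's code) shows why uuid_extract_time() can only report whatever the generator chose to store there:

#include <stdint.h>

/*
 * Illustrative helper, not the patch's code: reassemble unix_ts_ms from
 * a UUIDv7's first six bytes.  If the generator stored a business date
 * instead of "now", this faithfully returns that bogus value, which is
 * exactly the misuse discussed above.
 */
static int64_t
uuidv7_unix_ms(const unsigned char uuid[16])
{
    int64_t     ms = 0;

    for (int i = 0; i < 6; i++)
        ms = (ms << 8) | uuid[i];
    return ms;
}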
Re: index prefetching
On 1/25/24 11:45, Dilip Kumar wrote: > On Wed, Jan 24, 2024 at 11:43 PM Tomas Vondra > wrote: > >> On 1/22/24 07:35, Konstantin Knizhnik wrote: >>> >>> On 22/01/2024 1:47 am, Tomas Vondra wrote: h, right. Well, you're right in this case we perhaps could set just one of those flags, but the "purpose" of the two places is quite different. The "prefetch" flag is fully controlled by the prefetcher, and it's up to it to change it (e.g. I can easily imagine some new logic touching setting it to "false" for some reason). The "data" flag is fully controlled by the custom callbacks, so whatever the callback stores, will be there. I don't think it's worth simplifying this. In particular, I don't think the callback can assume it can rely on the "prefetch" flag. >>> Why not to add "all_visible" flag to IndexPrefetchEntry ? If will not >>> cause any extra space overhead (because of alignment), but allows to >>> avoid dynamic memory allocation (not sure if it is critical, but nice to >>> avoid if possible). >>> >> > While reading through the first patch I got some questions, I haven't > read it complete yet but this is what I got so far. > > 1. > +static bool > +IndexPrefetchBlockIsSequential(IndexPrefetch *prefetch, BlockNumber block) > +{ > + int idx; > ... > + if (prefetch->blockItems[idx] != (block - i)) > + return false; > + > + /* Don't prefetch if the block happens to be the same. */ > + if (prefetch->blockItems[idx] == block) > + return false; > + } > + > + /* not sequential, not recently prefetched */ > + return true; > +} > > The above function name is BlockIsSequential but at the end, it > returns true if it is not sequential, seem like a problem? Actually, I think it's the comment that's wrong - the last return is reached only for a sequential pattern (and when the block was not accessed recently). > Also other 2 checks right above the end of the function are returning > false if the block is the same or the pattern is sequential I think > those are wrong too. > Hmmm. You're right this is partially wrong. There are two checks: /* * For a sequential pattern, blocks "k" step ago needs to have block * number by "k" smaller compared to the current block. */ if (prefetch->blockItems[idx] != (block - i)) return false; /* Don't prefetch if the block happens to be the same. */ if (prefetch->blockItems[idx] == block) return false; The first condition is correct - we want to return "false" when the pattern is not sequential. But the second condition is wrong - we want to skip prefetching when the block was already prefetched recently, so this should return true (which is a bit misleading, as it seems to imply the pattern is sequential, when it's not). However, this is harmless, because we then identify this block as recently prefetched in the "full" cache check, so we won't prefetch it anyway. So it's harmless, although a bit more expensive. There's another inefficiency - we stop looking for the same block once we find the first block breaking the non-sequential pattern. Imagine a sequence of blocks 1, 2, 3, 1, 2, 3, ... in which case we never notice the block was recently prefetched, because we always find the break of the sequential pattern. But again, it's harmless, thanks to the full cache of recently prefetched blocks. > 2. > I have noticed that the prefetch history is maintained at the backend > level, but what if multiple backends are trying to fetch the same heap > blocks maybe scanning the same index, so should that be in some shared > structure? 
I haven't thought much deeper about this from the > implementation POV, but should we think about it, or it doesn't > matter? Yes, the cache is at the backend level - it's a known limitation, but I see it more as a conscious tradeoff. Firstly, while the LRU cache is at backend level, PrefetchBuffer also checks shared buffers for each prefetch request. So with sufficiently large shared buffers we're likely to find it there (and for direct I/O there won't be any other place to check). Secondly, the only other place to check is page cache, but there's no good (sufficiently cheap) way to check that. See the preadv2/nowait experiment earlier in this thread. I suppose we could implement a similar LRU cache for shared memory (and I don't think it'd be very complicated), but I did not plan to do that in this patch unless absolutely necessary. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
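To make the reordering described above concrete, here is a sketch of the corrected loop body; blockItems and the history-indexing helper are assumptions based on the quoted WIP patch, not committed code:

    for (int i = 1; i <= PREFETCH_SEQ_PATTERN_BLOCKS; i++)
    {
        /* history slot "i" steps back (indexing as in the WIP patch) */
        int         idx = PREFETCH_HISTORY_SLOT(prefetch, i);

        /*
         * This very block was prefetched recently: skip it.  Returning
         * true here is what makes the result mean "no need to prefetch",
         * not strictly "the pattern is sequential".
         */
        if (prefetch->blockItems[idx] == block)
            return true;

        /* block "i" steps back must be exactly "i" smaller, else no pattern */
        if (prefetch->blockItems[idx] != (block - i))
            return false;
    }

    /* sequential pattern, and the block was not prefetched recently */
    return true;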
Add new error_action COPY ON_ERROR "log"
Hi, As described in 9e2d870119, COPY ON_ERROR is expected to gain more "error_action" values. (Note that the option name was changed by b725b7eec.) I'd like to have a new option "log", which skips soft errors and logs information that should have resulted in errors to the PostgreSQL log. I think this option has some advantages like the below:

1) We can see which line of the input data was not loaded, and why. Example:

=# copy t1 from stdin with (on_error log);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1
>> 2
>> 3
>> z
>> \.
LOG: invalid input syntax for type integer: "z"
NOTICE: 1 row was skipped due to data type incompatibility
COPY 3
=# \! tail data/log/postgresql*.log
LOG: 22P02: invalid input syntax for type integer: "z"
CONTEXT: COPY t1, line 4, column i: "z"
LOCATION: pg_strtoint32_safe, numutils.c:620
STATEMENT: copy t1 from stdin with (on_error log);

2) Easier maintenance than storing error information in tables or proprietary log files. For example, in case a large number of soft errors occur, some mechanism is needed to prevent an infinite increase in the size of the destination data, but we can leave that to PostgreSQL's log rotation.

Attached is a patch. It basically comes from a previous discussion[1] which handled both "ignore" and "log" for soft errors. As shown in the example above, the log output to the client does not contain CONTEXT, so I'm a little concerned that the client cannot see what line of the input data had a problem without looking at the server log. What do you think?

[1] https://www.postgresql.org/message-id/c0fb57b82b150953f26a5c7e340412e8%40oss.nttdata.com

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

From 04e643facfea4b4e8dd174d22fbe5e008747a91a Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi
Date: Fri, 26 Jan 2024 01:17:59 +0900
Subject: [PATCH v1] Add new error_action "log" to ON_ERROR option

Currently the ON_ERROR option only has "ignore" to skip malformed data, and there is no way to know where and why COPY skipped it. "log" skips malformed data as well as "ignore", but it logs information that should have resulted in errors to the PostgreSQL log.
---
 doc/src/sgml/ref/copy.sgml          |  8 ++--
 src/backend/commands/copy.c         |  4 +++-
 src/backend/commands/copyfrom.c     | 24 
 src/include/commands/copy.h         |  1 +
 src/test/regress/expected/copy2.out | 14 +-
 src/test/regress/sql/copy2.sql      |  9 +
 6 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..9662c90a8b 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -380,12 +380,16 @@ COPY { table_name [ (
      Specifies which error_action to perform when there is malformed data in the input.
-     Currently, only stop (default) and ignore
-     values are supported.
+     Currently, only stop (default), ignore
+     and log values are supported.
      If the stop value is specified, COPY stops operation at the first error.
      If the ignore value is specified, COPY skips malformed data and continues copying data.
+     If the log value is specified,
+     COPY behaves the same as ignore, except that
+     it logs information that should have resulted in errors to the PostgreSQL log at
+     INFO level.
      The option is allowed only in COPY FROM.
      Only stop value is allowed when using binary format. 
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index cc0786c6f4..812ca63350 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -415,13 +415,15 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from) return COPY_ON_ERROR_STOP; /* - * Allow "stop", or "ignore" values. + * Allow "stop", "ignore" or "log" values. */ sval = defGetString(def); if (pg_strcasecmp(sval, "stop") == 0) return COPY_ON_ERROR_STOP; if (pg_strcasecmp(sval, "ignore") == 0) return COPY_ON_ERROR_IGNORE; + if (pg_strcasecmp(sval, "log") == 0) + return COPY_ON_ERROR_LOG; ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c index 1fe70b9133..7886bd5353 100644 --- a/src/backend/commands/copyfrom.c +++ b/src/backend/commands/copyfrom.c @@ -1013,6 +1013,23 @@ CopyFrom(CopyFromState cstate) */ cstate->escontext->error_occurred = false; + else if (cstate->opts.on_error == COPY_ON_ERROR_LOG) + { +/* Adjust elevel so we don't jump out */ +cstate->escontext->error_data->elevel = LOG; + +/* + * Despite the name, this won't raise an error since elevel is + * LOG now. + */ +ThrowErrorData(cstate->escontext->error_data); + +/* Initialize escontext in preparation for next soft error */
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
On 2024-Jan-25, Alvaro Herrera wrote: > Here's a touched-up version of this patch. > diff --git a/src/backend/storage/lmgr/lwlock.c > b/src/backend/storage/lmgr/lwlock.c > index 98fa6035cc..4a5e05d5e4 100644 > --- a/src/backend/storage/lmgr/lwlock.c > +++ b/src/backend/storage/lmgr/lwlock.c > @@ -163,6 +163,13 @@ static const char *const BuiltinTrancheNames[] = { > [LWTRANCHE_LAUNCHER_HASH] = "LogicalRepLauncherHash", > [LWTRANCHE_DSM_REGISTRY_DSA] = "DSMRegistryDSA", > [LWTRANCHE_DSM_REGISTRY_HASH] = "DSMRegistryHash", > + [LWTRANCHE_COMMITTS_SLRU] = "CommitTSSLRU", > + [LWTRANCHE_MULTIXACTOFFSET_SLRU] = "MultixactOffsetSLRU", > + [LWTRANCHE_MULTIXACTMEMBER_SLRU] = "MultixactMemberSLRU", > + [LWTRANCHE_NOTIFY_SLRU] = "NotifySLRU", > + [LWTRANCHE_SERIAL_SLRU] = "SerialSLRU" > + [LWTRANCHE_SUBTRANS_SLRU] = "SubtransSLRU", > + [LWTRANCHE_XACT_SLRU] = "XactSLRU", > }; Eeek. Last minute changes ... Fixed here. -- Álvaro HerreraBreisgau, Deutschland — https://www.EnterpriseDB.com/ "La primera ley de las demostraciones en vivo es: no trate de usar el sistema. Escriba un guión que no toque nada para no causar daños." (Jakob Nielsen) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 61038472c5..3e3119865a 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2006,6 +2006,145 @@ include_dir 'conf.d' + + commit_timestamp_buffers (integer) + + commit_timestamp_buffers configuration parameter + + + + +Specifies the amount of memory to use to cache the contents of +pg_commit_ts (see +). +If this value is specified without units, it is taken as blocks, +that is BLCKSZ bytes, typically 8kB. +The default value is 0, which requests +shared_buffers/512 up to 1024 blocks, +but not fewer than 16 blocks. +This parameter can only be set at server start. + + + + + + multixact_members_buffers (integer) + + multixact_members_buffers configuration parameter + + + + +Specifies the amount of shared memory to use to cache the contents +of pg_multixact/members (see +). +If this value is specified without units, it is taken as blocks, +that is BLCKSZ bytes, typically 8kB. +The default value is 32. +This parameter can only be set at server start. + + + + + + multixact_offsets_buffers (integer) + + multixact_offsets_buffers configuration parameter + + + + +Specifies the amount of shared memory to use to cache the contents +of pg_multixact/offsets (see +). +If this value is specified without units, it is taken as blocks, +that is BLCKSZ bytes, typically 8kB. +The default value is 16. +This parameter can only be set at server start. + + + + + + notify_buffers (integer) + + notify_buffers configuration parameter + + + + +Specifies the amount of shared memory to use to cache the contents +of pg_notify (see +). +If this value is specified without units, it is taken as blocks, +that is BLCKSZ bytes, typically 8kB. +The default value is 16. +This parameter can only be set at server start. + + + + + + serializable_buffers (integer) + + serializable_buffers configuration parameter + + + + +Specifies the amount of shared memory to use to cache the contents +of pg_serial (see +). +If this value is specified without units, it is taken as blocks, +that is BLCKSZ bytes, typically 8kB. +The default value is 32. +This parameter can only be set at server start. + + + + + + subtransaction_buffers (integer) + + subtransaction_buffers configuration parameter + + + + +Specifies the amount of shared memory to use to cache the contents +of pg_subtrans (see +). 
+If this value is specified without units, it is taken as blocks, +that is BLCKSZ bytes, typically 8kB. +The default value is 0, which requests +shared_buffers/512 up to 1024 blocks, +but not fewer than 16 blocks. +This parameter can only be set at server start. + + + + + + transaction_buffers (integer) + + transaction_buffers configuration parameter + + + + +Specifies the amount of shared memory to use to cache the contents +of pg_xact (see +). +If this value is specified without units, it is taken as blocks, +that is BLCKSZ bytes, typically 8kB. +
Re: make dist using git archive
On Thu Jan 25, 2024 at 10:04 AM CST, Peter Eisentraut wrote: On 22.01.24 21:04, Tristan Partin wrote: >> + 'HEAD', '.'], >> + install: false, >> + output: distdir + '.tar.bz2', >> +) > > The bz2 target should be wrapped in an `if bzip2.found()`. The way that this currently works is that you will fail at configure time if bz2 doesn't exist on the system. Meson will try to resolve a .path() method on a NotFoundProgram. You might want to define the bz2 target to just call `exit 1` in this case. if bzip2.found() # do your current target else bz2 = run_target('tar.bz2', command: ['exit', 1]) endif This should cause pgdist to appropriately fail at run time when generating the bz2 tarball. Well, I think we want the dist step to fail if bzip2 is not there. At least that is the current expectation. >> +alias_target('pgdist', [check_dirty_index, tar_gz, tar_bz2]) > > Are you intending for the check_dirty_index target to prohibit the other > two targets from running? Currently that is not the case. Yes, that was the hope, and that's how the make dist variant works. But I couldn't figure this out with meson. Also, the above actually also doesn't work with older meson versions, so I had to comment this out to get CI to work. > If it is what > you intend, use a stamp file or something to indicate a relationship. > Alternatively, inline the git diff-index into the other commands. These > might also do better as external scripts. It would reduce duplication > between the autotools and Meson builds. Yeah, maybe that's a direction. For what it's worth, I run Meson 1.3, and the behavior of generating the tarballs even though it is a dirty tree still occurred. In the new patch you seem to say it was fixed in 0.60. -- Tristan Partin Neon (https://neon.tech)
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Here's a touched-up version of this patch. First, PROC_GLOBAL->clogGroupFirst and SlruCtl->latest_page_number change from being protected by locks to being atomics, but there's no mention of considering memory barriers that they should have. Looking at the code, I think the former doesn't need any additional barriers, but latest_page_number is missing some, which I have added. This deserves another careful look. Second, and most user-visibly, the GUC names seem to have been chosen based on the source-code variables, which were never meant to be user-visible. So I renamed a few:

xact_buffers -> transaction_buffers
subtrans_buffers -> subtransaction_buffers
serial_buffers -> serializable_buffers
commit_ts_buffers -> commit_timestamp_buffers
(unchanged: multixact_offsets_buffers, multixact_members_buffers, notify_buffers)

I did this explicitly trying to avoid using the term SLRU in a user-visible manner, because what do users care? But immediately after doing this I realized that we already have pg_stat_slru, so maybe the cat is already out of the bag, and so perhaps we should name these GUCs as, say, slru_transaction_buffers? That may make the connection between these things a little more explicit. (I do think we need to add cross-links in the documentation of those GUCs to the pg_stat_slru docs and vice-versa.)

Another thing that bothered me a bit is that we have auto-tuning for transaction_buffers and commit_timestamp_buffers, but not for subtransaction_buffers. (Auto-tuning means you set the GUC to 0 and it scales with shared_buffers.) I don't quite understand the reason for the omission, so I added it for subtrans too. I think it may make sense to do likewise for the multixact ones too, not sure. It doesn't seem worth having that for pg_serial and notify.

While messing about with these GUCs I realized that the usage of the show_hook to print what the current number is, when auto-tuning is used, was bogus: SHOW would print the number of blocks for (say) transaction_buffers, but if you asked it to print (say) multixact_offsets_buffers, it would give a size in MB or kB. I'm sure such an inconsistency would bite us. So, digging around I found that a good way to handle this is to remove the show_hook, and instead call SetConfigOption() at the time when the ShmemInit function is called, with the correct number of buffers determined. This is pretty much what is used for XLOGbuffers, and it works correctly as far as my testing shows.

Still with these auto-tuning GUCs, I noticed that the auto-tuning code would continue to grow the buffer sizes with shared_buffers to arbitrarily large values. I added an arbitrary maximum of 1024 (8 MB), which is much higher than the current value of 128; but if you have (say) 30 GB of shared_buffers (not uncommon these days), do you really need 30 MB of pg_clog cache? It seems mostly unnecessary ... and you can still set it manually that way if you need it. So, largely I just rewrote those small functions completely.

I also made the SGML documentation and postgresql.sample.conf all match what the code actually does. The whole thing wasn't particularly consistent. I rewrote a bunch of code comments and moved stuff around to appear in alphabetical order, etc. More comment rewriting still pending. 
-- Álvaro HerreraBreisgau, Deutschland — https://www.EnterpriseDB.com/ diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 61038472c5..3e3119865a 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2006,6 +2006,145 @@ include_dir 'conf.d' + + commit_timestamp_buffers (integer) + + commit_timestamp_buffers configuration parameter + + + + +Specifies the amount of memory to use to cache the contents of +pg_commit_ts (see +). +If this value is specified without units, it is taken as blocks, +that is BLCKSZ bytes, typically 8kB. +The default value is 0, which requests +shared_buffers/512 up to 1024 blocks, +but not fewer than 16 blocks. +This parameter can only be set at server start. + + + + + + multixact_members_buffers (integer) + + multixact_members_buffers configuration parameter + + + + +Specifies the amount of shared memory to use to cache the contents +of pg_multixact/members (see +). +If this value is specified without units, it is taken as blocks, +that is BLCKSZ bytes, typically 8kB. +The default value is 32. +This parameter can only be set at server start. + + + + + + multixact_offsets_buffers (integer) + + multixact_offsets_buffers configuration parameter + + + + +Specifies the amount of shared memory to use to cache the contents +of pg_multixact/offsets (see +
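The auto-tuning rule spelled out in the SGML above reduces to a simple clamp; a sketch follows (the helper name is illustrative, Min/Max as in c.h):

/*
 * Sketch of the auto-tuning described above: a setting of 0 requests
 * shared_buffers/512, clamped to the range [16, 1024] buffers.
 */
static int
slru_buffers_autotune(int guc_value, int shared_buffers)
{
    if (guc_value != 0)
        return guc_value;       /* explicit setting wins */
    return Min(1024, Max(16, shared_buffers / 512));
}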
Re: Emit fewer vacuum records by reaping removable tuples during pruning
On Thu, Jan 25, 2024 at 10:19 AM Robert Haas wrote: > > On Thu, Jan 25, 2024 at 9:18 AM Melanie Plageman > wrote: > > > To me, the first paragraph of this one misses the mark. What I thought > > > we should be saying here was something like "If we don't have a > > > cleanup lock, the code above has already processed this page to the > > > extent that is possible. Otherwise, we either got the cleanup lock > > > initially and have not processed the page yet, or we didn't get it > > > initially, attempted to process it without the cleanup lock, and > > > decided we needed one after all. Either way, if we now have the lock, > > > we must prune, freeze, and count tuples." > > > > I see. Your suggestion makes sense. The sentence starting with > > "Otherwise" is a bit long. I started to lose the thread at "decided we > > needed one after all". You previously referred to the cleanup lock as > > "it" -- once you start referring to it as "one", I as the future > > developer am no longer sure we are talking about the cleanup lock (as > > opposed to the page or something else). > > Ok... trying again: > > If we have a cleanup lock, we must now prune, freeze, and count > tuples. We may have acquired the cleanup lock originally, or we may > have gone back and acquired it after lazy_scan_noprune() returned > false. Either way, the page hasn't been processed yet. Cool. I might add "successfully" or "fully" to "Either way, the page hasn't been processed yet" > > I think it would be nice to clarify this comment. I think the "when > > there is little free space anyway" is referring to the case in > > lazy_vacuum() where we set do_index_vacuuming to false because "there > > are almost zero TIDs". I initially thought it was saying that in the > > failsafe vacuum case the pages whose free space we wouldn't record in > > the FSM have little free space anyway -- which I didn't get. But then > > I looked at where we set do_index_vacuuming to false. > > Oh... wow. That's kind of confusing; somehow I was thinking we were > talking about free space on the disk, rather than newly free space in > pages that could be added to the FSM. Perhaps I misunderstood. I interpreted it to refer to the bypass optimization. > And it seems really questionable > whether that case is OK. I mean, in the emergency case, fine, > whatever, we should do whatever it takes to get the system back up, > and it should barely ever happen on a well-configured system. But this > case could happen regularly, and losing track of free space could > easily cause bloat. > > This might be another argument for moving FSM updates to the first > heap pass, but that's a separate task from fixing the comment. Yes, it seems we could miss recording space freed in the first pass if we never end up doing a second pass. consider_bypass_optimization is set to false only if index cleanup is explicitly enabled or there are dead items accumulated for vacuum's second pass at some point. - Melanie
Re: cleanup patches for incremental backup
On Thu, Jan 25, 2024 at 10:06:41AM -0500, Robert Haas wrote: > On Wed, Jan 24, 2024 at 2:39 PM Nathan Bossart > wrote: >> That seems like a reasonable starting point. Even if it doesn't help >> determine the root cause, it should at least help rule out concurrent >> summary removal. > > Here is a patch for that. LGTM. The only thing I might add is the cutoff_time in that LOG. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: make dist using git archive
On 22.01.24 21:04, Tristan Partin wrote: +git = find_program('git', required: false, native: true, disabler: true) +bzip2 = find_program('bzip2', required: false, native: true, disabler: true) This doesn't need to be a disabler. git is fine as-is. See later comment. Disablers only work like you are expecting when they are used like how git is used. Once you call a method like .path(), all bets are off. ok, fixed +distdir = meson.project_name() + '-' + meson.project_version() + +check_dirty_index = run_target('check-dirty-index', + command: [git, 'diff-index', '--quiet', 'HEAD']) Seems like you might want to add -C here too? done +tar_bz2 = custom_target('tar.bz2', + build_always_stale: true, + command: [git, '-C', '@SOURCE_ROOT@', '-c', 'tar.tar.bz2.command=' + bzip2.path() + ' -c', 'archive', + '--format', 'tar.bz2', + '--prefix', distdir + '/', - '-o', '@BUILD_ROOT@/@OUTPUT@', + '-o', join_paths(meson.build_root(), '@OUTPUT@'), This will generate the tarballs in the build directory. Do the same for the previous target. Tested locally. Fixed, thanks. I had struggled with this one. + 'HEAD', '.'], + install: false, + output: distdir + '.tar.bz2', +) The bz2 target should be wrapped in an `if bzip2.found()`. Well, I think we want the dist step to fail if bzip2 is not there. At least that is the current expectation. +alias_target('pgdist', [check_dirty_index, tar_gz, tar_bz2]) Are you intending for the check_dirty_index target to prohibit the other two targets from running? Currently that is not the case. Yes, that was the hope, and that's how the make dist variant works. But I couldn't figure this out with meson. Also, the above actually also doesn't work with older meson versions, so I had to comment this out to get CI to work. If it is what you intend, use a stamp file or something to indicate a relationship. Alternatively, inline the git diff-index into the other commands. These might also do better as external scripts. It would reduce duplication between the autotools and Meson builds. Yeah, maybe that's a direction. The updated patch also supports vpath builds with make now. I have also added a CI patch, for amusement. Maybe we'll want to keep it, though.From 13612f9a1c486e8acfe4156a7bc069a66fd30e77 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Thu, 25 Jan 2024 12:28:28 +0100 Subject: [PATCH v1 1/2] make dist uses git archive Discussion: https://www.postgresql.org/message-id/flat/40e80f77-a294-4f29-a16f-e21bc7bc75fc%40eisentraut.org --- GNUmakefile.in | 34 -- meson.build| 41 + 2 files changed, 53 insertions(+), 22 deletions(-) diff --git a/GNUmakefile.in b/GNUmakefile.in index eba569e930e..680c765dd73 100644 --- a/GNUmakefile.in +++ b/GNUmakefile.in @@ -87,29 +87,19 @@ update-unicode: | submake-generated-headers submake-libpgport distdir= postgresql-$(VERSION) dummy = =install= +GIT = git + dist: $(distdir).tar.gz $(distdir).tar.bz2 - rm -rf $(distdir) - -$(distdir).tar: distdir - $(TAR) chf $@ $(distdir) - -.INTERMEDIATE: $(distdir).tar - -distdir-location: - @echo $(distdir) - -distdir: - rm -rf $(distdir)* $(dummy) - for x in `cd $(top_srcdir) && find . 
\( -name CVS -prune \) -o \( -name .git -prune \) -o -print`; do \ - file=`expr X$$x : 'X\./\(.*\)'`; \ - if test -d "$(top_srcdir)/$$file" ; then \ - mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \ - else \ - ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \ - || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \ - fi || exit; \ - done - $(MAKE) -C $(distdir) distclean + +.PHONY: check-dirty-index +check-dirty-index: + $(GIT) -C $(srcdir) diff-index --quiet HEAD + +$(distdir).tar.gz: check-dirty-index + $(GIT) -C $(srcdir) archive --format tar.gz --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@ + +$(distdir).tar.bz2: check-dirty-index + $(GIT) -C $(srcdir) -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@ distcheck: dist rm -rf $(dummy) diff --git a/meson.build b/meson.build index 55184db2488..43884e7cfdd 100644 --- a/meson.build +++ b/meson.build @@ -3348,6 +3348,47 @@ run_target('help', +### +# Distribution archive +### + +git = find_program('git', required: false, native: true) +bzip2 = find_program('bzip2', required: false, native: true) + +distdir = meson.project_name() + '-' + meson.project_version() + +check_dirty_index = run_target('check-dirty-index', + command: [git, '-C', '@SOURCE_ROOT@', +
Re: Patch: Improve Boolean Predicate JSON Path Docs
"David E. Wheeler" writes: > On Jan 24, 2024, at 16:32, Tom Lane wrote: >> + >> + Predicate check expressions are required in the >> + @@ operator (and the >> + jsonb_path_match function), and should not be >> used >> + with the @? operator (or the >> + jsonb_path_exists function). >> + >> + >> + > I had this bit here: > >Conversely, non-predicate jsonpath expressions should not > be >used with the @@ operator (or the >jsonb_path_match function). > I changed the preceding para to say "... check expressions are required in ...", which I thought was sufficient to cover that. Also, the tabular description of the operator tells you not to do it. > What do you think of also dropping the article from all the references to > “the strict mode” or “the lax mode”, to make them “strict mode” and “lax > mode”, respectively? Certainly most of 'em don't need it. I'll make it so. regards, tom lane
Re: A compiling warning in jsonb_populate_record_valid
Amit Langote writes: > On Fri, Jan 26, 2024 at 0:15 Tom Lane wrote: >> Amit Langote writes: >>> Ignoring the warning was my 1st thought too, because an old discussion I >>> found about the warning was too old (2011). The style I adopted in my >>> “fix” is used in a few other places too, so I thought I might as well >>> go for it. >> Oh, well maybe I'm missing some context. What comparable places did >> you see? > Sorry, not on my computer right now so can’t paste any code, but I was able > to find a couple of functions (using Google ;) that refer to error_occurred > directly for one reason or another: OK, looking around, it seems like our pattern is that direct access to escontext.error_occurred is okay in the same function that sets up the escontext (so that there's no possibility of a NULL pointer). So this change is fine and I withdraw my objection. Sorry for the noise. regards, tom lane
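A sketch of the pattern being blessed here, in backend style: the function that initializes the ErrorSaveContext may inspect error_occurred directly, since the pointer is provably non-NULL there (pg_strtoint32_safe() is a real soft-error input routine; the surrounding fragment is illustrative):

#include "postgres.h"
#include "nodes/miscnodes.h"
#include "utils/builtins.h"

static bool
int32_in_is_valid(const char *str)
{
    ErrorSaveContext escontext = {T_ErrorSaveContext};

    /* Soft-error call: a parse failure is captured, not thrown. */
    (void) pg_strtoint32_safe(str, (Node *) &escontext);

    /*
     * Direct access is fine: this function set up escontext itself, so
     * there is no risk of a NULL pointer here.
     */
    return !escontext.error_occurred;
}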
Re: More new SQL/JSON item methods
On 2024-01-18 Th 09:25, Jeevan Chalke wrote: On Thu, Jan 18, 2024 at 1:03 AM Peter Eisentraut wrote: On 17.01.24 10:03, Jeevan Chalke wrote: > I added unary '+' and '-' support as well and thus thought of having > separate rules altogether rather than folding those in. > > Per SQL standard, the precision and scale arguments are unsigned > integers, so unary plus and minus signs are not supported. So my patch > removes that support, but I didn't adjust the regression tests for that. > > > However, PostgreSQL numeric casting does support a negative scale. Here > is an example:

> # select '12345'::numeric(4,-2);
>  numeric
> ---------
>    12300
> (1 row)

> And thus thought of supporting those. > Do we want this JSON item method to behave differently here? Ok, it would make sense to support this in SQL/JSON as well. OK. So with this, we don't need changes done in your 0001 patches. > I will merge them all into one and will try to keep them in the order > specified in sql_features.txt. > However, for documentation, it makes more sense to keep them in logical > order than the alphabetical one. What are your views on this? The documentation can be in a different order. Thanks, Andrew and Peter for the confirmation. Attached merged single patch along these lines. Thanks, I have pushed this. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: A compiling warning in jsonb_populate_record_valid
On Fri, Jan 26, 2024 at 0:15 Tom Lane wrote: > Amit Langote writes: > > On Thu, Jan 25, 2024 at 23:57 Tom Lane wrote: > >> -1 please. We should not break that abstraction for the sake > >> of ignorable warnings on ancient compilers. > > > Ignoring the warning was my 1st thought too, because an old discussion I > > found about the warning was too old (2011). The style I adopted in my > > “fix” is used in a few other places too, so I thought I might as well > > go for it. > > Oh, well maybe I'm missing some context. What comparable places did > you see? Sorry, not on my computer right now so can’t paste any code, but I was able to find a couple of functions (using Google ;) that refer to error_occurred directly for one reason or another: process_integer_literal(), xml_is_document()
Re: Unnecessary smgropen in {heapam_relation,index}_copy_data?
On Thu, 25 Jan 2024 at 21:43, Aleksander Alekseev wrote: > Hi, > >> I find heapam_relation_copy_data() and index_copy_data() have the following >> code: >> >> dstrel = smgropen(*newrlocator, rel->rd_backend); >> >> ... >> >> RelationCreateStorage(*newrlocator, rel->rd_rel->relpersistence, >> true); >> >> The smgropen() is also called by RelationCreateStorage(), why should we call >> smgropen() explicitly here? >> >> I try to remove the smgropen(), and all tests passed. > > That's a very good question. Note that the second argument of > smgropen() used to create dstrel changes after applying your patch. > I'm not 100% sure whether this is significant or not. > Thanks for the review. According to the comments on RelationData->rd_backend, it is the backend ID if the relation is temporary. The difference is that RelationCreateStorage() uses relpersistence to determine the backend ID. > I added your patch to the nearest open commitfest so that we will not lose it: > > https://commitfest.postgresql.org/47/4794/ Thank you.
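Since RelationCreateStorage() returns the SMgrRelation it opens internally, the simplification under discussion can be sketched as follows (illustrative; whether every caller may rely on this is exactly what the review caveat above is about):

    /* before: the destination relation is opened twice */
    dstrel = smgropen(*newrlocator, rel->rd_backend);
    ...
    RelationCreateStorage(*newrlocator, rel->rd_rel->relpersistence, true);

    /* after (sketch): reuse the handle RelationCreateStorage opened */
    dstrel = RelationCreateStorage(*newrlocator, rel->rd_rel->relpersistence, true);

The thing to verify is the point Aleksander raises: the removed call derived the backend from rel->rd_backend, while RelationCreateStorage() derives it from relpersistence, so the two must agree for every caller.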
Re: Emit fewer vacuum records by reaping removable tuples during pruning
On Thu, Jan 25, 2024 at 9:18 AM Melanie Plageman wrote: > > To me, the first paragraph of this one misses the mark. What I thought > > we should be saying here was something like "If we don't have a > > cleanup lock, the code above has already processed this page to the > > extent that is possible. Otherwise, we either got the cleanup lock > > initially and have not processed the page yet, or we didn't get it > > initially, attempted to process it without the cleanup lock, and > > decided we needed one after all. Either way, if we now have the lock, > > we must prune, freeze, and count tuples." > > I see. Your suggestion makes sense. The sentence starting with > "Otherwise" is a bit long. I started to lose the thread at "decided we > needed one after all". You previously referred to the cleanup lock as > "it" -- once you start referring to it as "one", I as the future > developer am no longer sure we are talking about the cleanup lock (as > opposed to the page or something else). Ok... trying again: If we have a cleanup lock, we must now prune, freeze, and count tuples. We may have acquired the cleanup lock originally, or we may have gone back and acquired it after lazy_scan_noprune() returned false. Either way, the page hasn't been processed yet. > I think it would be nice to clarify this comment. I think the "when > there is little free space anyway" is referring to the case in > lazy_vacuum() where we set do_index_vacuuming to false because "there > are almost zero TIDs". I initially thought it was saying that in the > failsafe vacuum case the pages whose free space we wouldn't record in > the FSM have little free space anyway -- which I didn't get. But then > I looked at where we set do_index_vacuuming to false. Oh... wow. That's kind of confusing; somehow I was thinking we were talking about free space on the disk, rather than newly free space in pages that could be added to the FSM. And it seems really questionable whether that case is OK. I mean, in the emergency case, fine, whatever, we should do whatever it takes to get the system back up, and it should barely ever happen on a well-configured system. But this case could happen regularly, and losing track of free space could easily cause bloat. This might be another argument for moving FSM updates to the first heap pass, but that's a separate task from fixing the comment. > As for the last sentence starting with "Besides", even with Peter's > explanation I still am not sure what it should say. There are blocks > whose free space we don't record in the first heap pass. Then, due to > skipping index vacuuming and the second heap pass, we also don't > record their free space in the second heap pass. I think he is saying > that once we set do_index_vacuuming to false, we will stop skipping > updating the FSM after the first pass for future blocks. So, future > blocks will have their free space recorded in the FSM. But that feels > self-evident. Yes, I don't think that necessarily needs to be mentioned here. > The more salient point is that there are some blocks > whose free space is not recorded (those whose first pass happened > before unsetting do_index_vacuuming and whose second pass did not > happen before do_index_vacuuming is unset). The extra sentence made me > think there was some way we might go back and record free space for > those blocks, but that is not true. I don't really see why that sentence made you think that, but it's not important. I agree with you about what point we need to emphasize here. 
-- Robert Haas EDB: http://www.enterprisedb.com
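For readers following the comment rewrites, a rough, condensed sketch of the lazy_scan_heap() control flow being documented. This is paraphrased from the discussion, not the actual source.
```
/* per-block loop of lazy_scan_heap(), heavily condensed */
if (!ConditionalLockBufferForCleanup(buf))
{
    LockBuffer(buf, BUFFER_LOCK_SHARE);

    /* try to process the page without a cleanup lock */
    if (lazy_scan_noprune(vacrel, buf, blkno, page,
                          &hastup, &recordfreespace))
        continue;       /* fully processed; move on to the next block */

    /* lazy_scan_noprune() decided a cleanup lock is needed after all */
    LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    LockBufferForCleanup(buf);
}

/* either way, we hold the cleanup lock now: prune, freeze, count */
lazy_scan_prune(vacrel, buf, blkno, page, &prunestate);
```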
Re: A compiling warning in jsonb_populate_record_valid
Amit Langote writes: > On Thu, Jan 25, 2024 at 23:57 Tom Lane wrote: >> -1 please. We should not break that abstraction for the sake >> of ignorable warnings on ancient compilers. > Ignoring the warning was my 1st thought too, because an old discussion I > found about the warning was too old (2011). The style I adopted in my > “fix” is used in a few other places too, so I thought I might as well > go for it. Oh, well maybe I'm missing some context. What comparable places did you see? regards, tom lane
Re: A compiling warning in jsonb_populate_record_valid
On Thu, Jan 25, 2024 at 23:57 Tom Lane wrote: > Amit Langote writes: > > On Thu, Jan 25, 2024 at 2:59 PM Richard Guo > wrote: > >> I came across a warning when building master (a044e61f1b) on old GCC > >> (4.8.5). > > > Will apply the attached, which does this: > > > - return BoolGetDatum(!SOFT_ERROR_OCCURRED()); > > + return BoolGetDatum(!escontext.error_occurred); > > -1 please. We should not break that abstraction for the sake > of ignorable warnings on ancient compilers. Ignoring the warning was my 1st thought too, because an old discussion I found about the warning was too old (2011). The style I adopted in my “fix” is used in a few other places too, so I thought I might as well go for it. Anyway, I’m fine with reverting. >
Re: cleanup patches for incremental backup
On Wed, Jan 24, 2024 at 2:39 PM Nathan Bossart wrote: > That seems like a reasonable starting point. Even if it doesn't help > determine the root cause, it should at least help rule out concurrent > summary removal. Here is a patch for that. -- Robert Haas EDB: http://www.enterprisedb.com v1-0001-Temporary-patch-to-help-debug-pg_walsummary-test-.patch Description: Binary data
Re: A compiling warning in jsonb_populate_record_valid
Amit Langote writes: > On Thu, Jan 25, 2024 at 2:59 PM Richard Guo wrote: >> I came across a warning when building master (a044e61f1b) on old GCC >> (4.8.5). > Will apply the attached, which does this: > - return BoolGetDatum(!SOFT_ERROR_OCCURRED()); > + return BoolGetDatum(!escontext.error_occurred); -1 please. We should not break that abstraction for the sake of ignorable warnings on ancient compilers. regards, tom lane
Re: remaining sql/json patches
On Thu, Jan 25, 2024 at 7:54 PM Amit Langote wrote: > > > > > The problem with returning comp_domain_with_typmod from json_value() > > seems to be that it's using a text-to-record CoerceViaIO expression > > picked from JsonExpr.item_coercions, which behaves differently than > > the expression tree that the following uses: > > > > select ('abcd', 42)::comp_domain_with_typmod; > >row > > -- > > (abc,42) > > (1 row) > > Oh, it hadn't occurred to me to check what trying to coerce a "string" > containing the record literal would do: > > select '(''abcd'', 42)'::comp_domain_with_typmod; > ERROR: value too long for type character(3) > LINE 1: select '(''abcd'', 42)'::comp_domain_with_typmod; > > which is the same thing as what the JSON_QUERY() and JSON_VALUE() are > running into. So, it might be fair to think that the error is not a > limitation of the SQL/JSON patch but an underlying behavior that it > has to accept as is. > Hi, I have reconciled myself to these cases. What bugs me now is the first query of the following 4 cases (the others are listed for comparison). SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING char(3) omit quotes); SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING char(3) keep quotes); SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING text omit quotes); SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING text keep quotes); I did some minor refactoring of the function coerceJsonFuncExprOutput. It makes the following queries return NULL instead of raising an error; NULL is what JSON_VALUE() returns for them. SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING int2); SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING int4); SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING int8); SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING bool); SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING numeric); SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING real); SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING float8); v1-0001-minor-refactor-coerceJsonFuncExprOutput.no-cfbot Description: Binary data
Re: Emit fewer vacuum records by reaping removable tuples during pruning
On Thu, Jan 25, 2024 at 8:57 AM Robert Haas wrote: > > On Wed, Jan 24, 2024 at 9:13 PM Melanie Plageman > wrote: > > v12 attached has my attempt at writing better comments for this > > section of lazy_scan_heap(). > > + /* > + * If we didn't get the cleanup lock and the page is not new or empty, > + * we can still collect LP_DEAD items in the dead_items array for > + * later vacuuming, count live and recently dead tuples for vacuum > + * logging, and determine if this block could later be truncated. If > + * we encounter any xid/mxids that require advancing the > + * relfrozenxid/relminxid, we'll have to wait for a cleanup lock and > + * call lazy_scan_prune(). > + */ > > I like this comment. I would probably drop "and the page is not new or > empty" from it since that's really speaking to the previous bit of > code, but it wouldn't be the end of the world to keep it, either. Yes, probably best to get rid of the part about new or empty. > /* > - * Prune, freeze, and count tuples. > + * If we got a cleanup lock, we can prune and freeze tuples and > + * defragment the page. If we didn't get a cleanup lock, we will still > + * consider whether or not to update the FSM. > * > - * Accumulates details of remaining LP_DEAD line pointers on page in > - * dead_items array. This includes LP_DEAD line pointers that we > - * pruned ourselves, as well as existing LP_DEAD line pointers that > - * were pruned some time earlier. Also considers freezing XIDs in the > - * tuple headers of remaining items with storage. It also determines > - * if truncating this block is safe. > + * Like lazy_scan_noprune(), lazy_scan_prune() will count > + * recently_dead_tuples and live tuples for vacuum logging, determine > + * if the block can later be truncated, and accumulate the details of > + * remaining LP_DEAD line pointers on the page in the dead_items > + * array. These dead items include those pruned by lazy_scan_prune() > + * as well we line pointers previously marked LP_DEAD. > */ > > To me, the first paragraph of this one misses the mark. What I thought > we should be saying here was something like "If we don't have a > cleanup lock, the code above has already processed this page to the > extent that is possible. Otherwise, we either got the cleanup lock > initially and have not processed the page yet, or we didn't get it > initially, attempted to process it without the cleanup lock, and > decided we needed one after all. Either way, if we now have the lock, > we must prune, freeze, and count tuples." I see. Your suggestion makes sense. The sentence starting with "Otherwise" is a bit long. I started to lose the thread at "decided we needed one after all". You previously referred to the cleanup lock as "it" -- once you start referring to it as "one", I as the future developer am no longer sure we are talking about the cleanup lock (as opposed to the page or something else). > - * Note: It's not in fact 100% certain that we really will call > - * lazy_vacuum_heap_rel() -- lazy_vacuum() might yet opt to skip index > - * vacuuming (and so must skip heap vacuuming). This is deemed okay > - * because it only happens in emergencies, or when there is very > - * little free space anyway. (Besides, we start recording free space > - * in the FSM once index vacuuming has been abandoned.) > > Here's a suggestion from me: > > Note: In corner cases, it's possible to miss updating the FSM > entirely. If index vacuuming is currently enabled, we'll skip the FSM > update now. 
But if failsafe mode is later activated, disabling index > vacuuming, there will also be no opportunity to update the FSM later, > because we'll never revisit this page. Since updating the FSM is > desirable but not absolutely required, that's OK. > > I think this expresses the same sentiment as the current comment, but > IMHO more clearly. The one part of the current comment that I don't > understand at all is the remark about "when there is very little > freespace anyway". I get that if the failsafe activates we won't come > back to the page, which is the "only happens in emergencies" part of > the existing comment. But the current phrasing makes it sound like > there is a second case where it can happen -- "when there is very > little free space anyway" -- and I don't know what that is talking > about. If it's important, we should try to make it clearer. > > We could also just decide to keep this entire paragraph as it is for > purposes of the present patch. The part I really thought needed > adjusting was "Prune, freeze, and count tuples." I think it would be nice to clarify this comment. I think the "when there is little free space anyway" is referring to the case in lazy_vacuum() where we set do_index_vacuuming to false because "there are almost zero TIDs". I initially thought it was saying that in the failsafe vacuum case the pages whose free space we wouldn't record in the FSM have little free space anyway -- which I didn't get. But then I looked at where we set do_index_vacuuming to false.
Re: Emit fewer vacuum records by reaping removable tuples during pruning
On Wed, Jan 24, 2024 at 9:13 PM Melanie Plageman wrote: > v12 attached has my attempt at writing better comments for this > section of lazy_scan_heap(). + /* + * If we didn't get the cleanup lock and the page is not new or empty, + * we can still collect LP_DEAD items in the dead_items array for + * later vacuuming, count live and recently dead tuples for vacuum + * logging, and determine if this block could later be truncated. If + * we encounter any xid/mxids that require advancing the + * relfrozenxid/relminxid, we'll have to wait for a cleanup lock and + * call lazy_scan_prune(). + */ I like this comment. I would probably drop "and the page is not new or empty" from it since that's really speaking to the previous bit of code, but it wouldn't be the end of the world to keep it, either. /* - * Prune, freeze, and count tuples. + * If we got a cleanup lock, we can prune and freeze tuples and + * defragment the page. If we didn't get a cleanup lock, we will still + * consider whether or not to update the FSM. * - * Accumulates details of remaining LP_DEAD line pointers on page in - * dead_items array. This includes LP_DEAD line pointers that we - * pruned ourselves, as well as existing LP_DEAD line pointers that - * were pruned some time earlier. Also considers freezing XIDs in the - * tuple headers of remaining items with storage. It also determines - * if truncating this block is safe. + * Like lazy_scan_noprune(), lazy_scan_prune() will count + * recently_dead_tuples and live tuples for vacuum logging, determine + * if the block can later be truncated, and accumulate the details of + * remaining LP_DEAD line pointers on the page in the dead_items + * array. These dead items include those pruned by lazy_scan_prune() + * as well we line pointers previously marked LP_DEAD. */ To me, the first paragraph of this one misses the mark. What I thought we should be saying here was something like "If we don't have a cleanup lock, the code above has already processed this page to the extent that is possible. Otherwise, we either got the cleanup lock initially and have not processed the page yet, or we didn't get it initially, attempted to process it without the cleanup lock, and decided we needed one after all. Either way, if we now have the lock, we must prune, freeze, and count tuples." The second paragraph seems fine. - * Note: It's not in fact 100% certain that we really will call - * lazy_vacuum_heap_rel() -- lazy_vacuum() might yet opt to skip index - * vacuuming (and so must skip heap vacuuming). This is deemed okay - * because it only happens in emergencies, or when there is very - * little free space anyway. (Besides, we start recording free space - * in the FSM once index vacuuming has been abandoned.) Here's a suggestion from me: Note: In corner cases, it's possible to miss updating the FSM entirely. If index vacuuming is currently enabled, we'll skip the FSM update now. But if failsafe mode is later activated, disabling index vacuuming, there will also be no opportunity to update the FSM later, because we'll never revisit this page. Since updating the FSM is desirable but not absolutely required, that's OK. I think this expresses the same sentiment as the current comment, but IMHO more clearly. The one part of the current comment that I don't understand at all is the remark about "when there is very little freespace anyway". I get that if the failsafe activates we won't come back to the page, which is the "only happens in emergencies" part of the existing comment. 
But the current phrasing makes it sound like there is a second case where it can happen -- "when there is very little free space anyway" -- and I don't know what that is talking about. If it's important, we should try to make it clearer. We could also just decide to keep this entire paragraph as it is for purposes of the present patch. The part I really thought needed adjusting was "Prune, freeze, and count tuples." -- Robert Haas EDB: http://www.enterprisedb.com
Re: [PATCH] Add native windows on arm64 support
On Thu, 25 Jan 2024 at 08:31, Dave Cramer wrote: > > > On Wed, 24 Jan 2024 at 19:03, Michael Paquier wrote: > >> On Wed, Jan 24, 2024 at 06:45:21AM -0500, Dave Cramer wrote: >> > I managed to get it to build; the vcvarsall arch needs to be x64. I need >> to >> > add some options, but the patch above needs to be applied to build it. >> >> Nice. If I may ask, what kind of host and/or configuration have you >> used to reach a state where the code can be compiled and run tests >> with meson? If you have found specific steps, it may be a good thing >> to document that on the wiki, say around [1]. >> > > I am running a Mac mini M2 with Parallels running Windows 11 Pro > The only thing required is this patch. Running in a native developer > prompt > > meson setup build --prefix=c:\postgres > cd build > ninja > ninja install > ninja test > > all run without errors > I also have the buildfarm running and will run that today > > Dave > >> >> Perhaps you have not included TAP? It may be fine in terms of runtime >> checks and coverage. >> > > Does TAP have to be explicitly added to the meson build? > > Dave > I tried running my buildfarm using my git repo and my branch, but got the following error: Status Line: 492 bad branch parameter Content: bad branch parameter fix_arm Web txn failed with status: 1
Re: Unnecessary smgropen in {heapam_relation,index}_copy_data?
Hi, > I find heapam_relation_copy_data() and index_copy_data() have the following > code: > > dstrel = smgropen(*newrlocator, rel->rd_backend); > > ... > > RelationCreateStorage(*newrlocator, rel->rd_rel->relpersistence, > true); > > The smgropen() is also called by RelationCreateStorage(), why should we call > smgropen() explicitly here? > > I try to remove the smgropen(), and all tests passed. That's a very good question. Note that the second argument of smgropen() used to create dstrel changes after applying your patch. I'm not 100% sure whether this is significant or not. I added your patch to the nearest open commitfest so that we will not lose it: https://commitfest.postgresql.org/47/4794/ -- Best regards, Aleksander Alekseev
Re: [PATCH] Add native windows on arm64 support
On Wed, 24 Jan 2024 at 19:03, Michael Paquier wrote: > On Wed, Jan 24, 2024 at 06:45:21AM -0500, Dave Cramer wrote: > > I managed to get it to build; the vcvarsall arch needs to be x64. I need > to > > add some options, but the patch above needs to be applied to build it. > > Nice. If I may ask, what kind of host and/or configuration have you > used to reach a state where the code can be compiled and run tests > with meson? If you have found specific steps, it may be a good thing > to document that on the wiki, say around [1]. > I am running a Mac mini M2 with Parallels running Windows 11 Pro The only thing required is this patch. Running in a native developer prompt meson setup build --prefix=c:\postgres cd build ninja ninja install ninja test all run without errors I also have the buildfarm running and will run that today Dave > > Perhaps you have not included TAP? It may be fine in terms of runtime > checks and coverage. > Does TAP have to be explicitly added to the meson build? Dave
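On the TAP question: PostgreSQL's meson build exposes a tap_tests feature option, so the TAP suites can be enabled explicitly at setup time. A sketch, assuming a Perl installation with IPC::Run is available on the Windows host:
```
meson setup build --prefix=c:\postgres -Dtap_tests=enabled
meson compile -C build
meson test -C build
```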
Re: Use of backup_label not noted in log
Hi, On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote: > I would still advocate for a back patch here. It is frustrating to get logs > from users that just say: > > LOG: invalid checkpoint record > PANIC: could not locate a valid checkpoint record > > It would be very helpful to know what the checkpoint record LSN was in this > case. I agree. Michael
Re: Remove unused fields in ReorderBufferTupleBuf
Hi, > > Here is the corrected patch. > > Thank you for updating the patch! I have some comments: Thanks for the review. > -tuple = (ReorderBufferTupleBuf *) > +tuple = (HeapTuple) > MemoryContextAlloc(rb->tup_context, > - > sizeof(ReorderBufferTupleBuf) + > + HEAPTUPLESIZE + > MAXIMUM_ALIGNOF + > alloc_len); > -tuple->alloc_tuple_size = alloc_len; > -tuple->tuple.t_data = ReorderBufferTupleBufData(tuple); > +tuple->t_data = (HeapTupleHeader)((char *)tuple + HEAPTUPLESIZE); > > Why do we need to add MAXIMUM_ALIGNOF? since HEAPTUPLESIZE is the > MAXALIGN'd size, it seems we don't need it. heap_form_tuple() does a > similar thing but it doesn't add it. Indeed. I gave it a try and nothing crashed, so it appears that MAXIMUM_ALIGNOF is not needed. > --- > xl_parameter_change *xlrec = > -(xl_parameter_change *) > XLogRecGetData(buf->record); > +(xl_parameter_change *) > XLogRecGetData(buf->record); > > Unnecessary change. That's pgindent. Fixed. > --- > -/* > - * Free a ReorderBufferTupleBuf. > - */ > -void > -ReorderBufferReturnTupleBuf(ReorderBuffer *rb, ReorderBufferTupleBuf *tuple) > -{ > -pfree(tuple); > -} > - > > Why does ReorderBufferReturnTupleBuf need to be moved from > reorderbuffer.c to reorderbuffer.h? It seems not related to this > refactoring patch so I think we should do it in a separate patch if we > really want it. I'm not sure it's necessary, though. OK, fixed. -- Best regards, Aleksander Alekseev v6-0001-Remove-ReorderBufferTupleBuf-structure.patch Description: Binary data
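A sketch of the resulting allocation pattern, and why the extra MAXIMUM_ALIGNOF is unnecessary. This mirrors heap_form_tuple(); the variable names are taken from the quoted hunk.
```
/*
 * HEAPTUPLESIZE is MAXALIGN(sizeof(HeapTupleData)), so t_data below is
 * already maxaligned; no extra MAXIMUM_ALIGNOF padding is needed.
 */
tuple = (HeapTuple) MemoryContextAlloc(rb->tup_context,
                                       HEAPTUPLESIZE + alloc_len);
tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
```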
Re: Use of backup_label not noted in log
On 1/25/24 04:12, Michael Paquier wrote: On Mon, Jan 22, 2024 at 04:36:27PM +0900, Michael Paquier wrote: + if (ControlFile->backupStartPoint != InvalidXLogRecPtr) Nit 1: I would use XLogRecPtrIsInvalid here. + ereport(LOG, + (errmsg("completed backup recovery with redo LSN %X/%X", + LSN_FORMAT_ARGS(oldBackupStartPoint)))); Nit 2: How about adding backupEndPoint in this LOG? That would give: "completed backup recovery with redo LSN %X/%X and end LSN %X/%X". Hearing nothing, I've just applied a version of the patch with these two modifications on HEAD. If this needs tweaks, just let me know. I had planned to update the patch this morning -- so thanks for doing that. I think having the end point in the message makes perfect sense. I would still advocate for a back patch here. It is frustrating to get logs from users that just say: LOG: invalid checkpoint record PANIC: could not locate a valid checkpoint record It would be very helpful to know what the checkpoint record LSN was in this case. Regards, -David
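For illustration, with the committed wording the log line David is asking for should read roughly like this (the LSN values here are invented):
```
LOG:  completed backup recovery with redo LSN 0/2000028 and end LSN 0/2000100
```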
Re: UUID v7
Hi, > Andrey, many thanks for the updated patch. > > LGTM, cfbot is happy and I don't think we have any open items left. So > changing CF entry status back to RfC. PFA v14. I changed: ``` elog(ERROR, "Time argument of UUID v7 cannot exceed 6 bytes"); ``` ... to: ``` ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("Time argument of UUID v7 is outside of the valid range"))); ``` Which IMO tells a bit more to the average user and is translatable. > At a quick glance, the patch needs improving English, IMO. Agree. We could use some help from a native English speaker for this. -- Best regards, Aleksander Alekseev v14-0001-Implement-UUID-v7.patch Description: Binary data
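For context, a hedged usage sketch of the error path being discussed. The function name uuidv7() and its optional time argument are assumptions based on the patch series, since only the error text is quoted above:
```
-- normal generation, assuming the patch exposes uuidv7()
SELECT uuidv7();

-- a time argument whose millisecond value does not fit in the 48-bit
-- timestamp field would now raise the translatable ereport() above
-- instead of the old untranslatable elog()
```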