Re: GenBKI emits useless open;close for catalogs without rows
On Tue, Apr 9, 2024 at 12:03 AM Michael Paquier wrote: > Hmm, is that productive? This patch has been waiting on author since > the 1st of February, and it was already moved from the CF 2024-01 to > 2024-03. It would make more sense to me to mark it as RwF, then > resubmit if there is still interest in working on this topic rather > than move it again. Done. -- Robert Haas EDB: http://www.enterprisedb.com
Re: GenBKI emits useless open;close for catalogs without rows
On Mon, Apr 08, 2024 at 11:11:07AM +0300, Andrey M. Borodin wrote: > This is kind reminder that this thread is waiting for your response. > CF entry [0] is in "Waiting on Author", I'll move it to July CF. Hmm, is that productive? This patch has been waiting on author since the 1st of February, and it was already moved from the CF 2024-01 to 2024-03. It would make more sense to me to mark it as RwF, then resubmit if there is still interest in working on this topic rather than move it again. My personal inner rule is there is enough ground for a patch to be marked as RwF if it has been waiting on author since the middle of a commit fest, which would be the 15th of March for CF 2024-03. This lets two weeks to authors to react. -- Michael signature.asc Description: PGP signature
Re: GenBKI emits useless open;close for catalogs without rows
> On 22 Sep 2023, at 18:50, Matthias van de Meent > wrote: Hi Matthias! This is kind reminder that this thread is waiting for your response. CF entry [0] is in "Waiting on Author", I'll move it to July CF. Thanks! Best regards, Andrey Borodin. [0] https://commitfest.postgresql.org/47/4544/
Re: GenBKI emits useless open;close for catalogs without rows
On Wed, 8 Nov 2023 at 12:50, Peter Eisentraut wrote: > > On 08.11.23 08:16, Peter Eisentraut wrote: > > On 19.09.23 20:05, Heikki Linnakangas wrote: > >> One thing caught my eye though: We currently have an "open" command > >> after every "create". Except for bootstrap relations; creating a > >> bootstrap relation opens it implicitly. That seems like a weird > >> inconsistency. If we make "create" to always open the relation, we can > >> both make it more consistent and save a few lines. That's maybe worth > >> doing, per the attached. It removes the "open" command altogether, as > >> it's not needed anymore. > > > > This seems like a good improvement to me. > > > > It would restrict the bootstrap language so that you can only manipulate > > a table right after creating it, but I don't see why that wouldn't be > > sufficient. > > Then again, this sort of achieves the opposite of what Matthias was > aiming for: You are now forcing some relations to be opened even though > we will end up closing it right away. > > (In any case, documentation in bki.sgml would need to be updated for > this patch.) I have changed the status of the patch to WOA, feel free to update the status once Peter's documentation comments are addressed. Regards, Vignesh
Re: GenBKI emits useless open;close for catalogs without rows
On 08.11.23 08:16, Peter Eisentraut wrote: On 19.09.23 20:05, Heikki Linnakangas wrote: One thing caught my eye though: We currently have an "open" command after every "create". Except for bootstrap relations; creating a bootstrap relation opens it implicitly. That seems like a weird inconsistency. If we make "create" to always open the relation, we can both make it more consistent and save a few lines. That's maybe worth doing, per the attached. It removes the "open" command altogether, as it's not needed anymore. This seems like a good improvement to me. It would restrict the bootstrap language so that you can only manipulate a table right after creating it, but I don't see why that wouldn't be sufficient. Then again, this sort of achieves the opposite of what Matthias was aiming for: You are now forcing some relations to be opened even though we will end up closing it right away. (In any case, documentation in bki.sgml would need to be updated for this patch.)
Re: GenBKI emits useless open;close for catalogs without rows
On 19.09.23 20:05, Heikki Linnakangas wrote: One thing caught my eye though: We currently have an "open" command after every "create". Except for bootstrap relations; creating a bootstrap relation opens it implicitly. That seems like a weird inconsistency. If we make "create" to always open the relation, we can both make it more consistent and save a few lines. That's maybe worth doing, per the attached. It removes the "open" command altogether, as it's not needed anymore. This seems like a good improvement to me. It would restrict the bootstrap language so that you can only manipulate a table right after creating it, but I don't see why that wouldn't be sufficient.
Re: GenBKI emits useless open;close for catalogs without rows
On Fri, 22 Sept 2023 at 00:25, Andres Freund wrote: > > Hi, > > On 2023-09-19 21:05:41 +0300, Heikki Linnakangas wrote: > > On 18/09/2023 17:50, Matthias van de Meent wrote: > > > (initdb takes about 73ms locally with syncing disabled) > > > > That's impressive. It takes about 600 ms on my laptop. Of which about 140 ms > > goes into processing the BKI file. And that's with "initdb -no-sync" option. > > I think there must be a digit missing in Matthias' numbers. Yes, kind of. The run was on 50 iterations, not the assumed 250. Also note that the improved measurements were recorded inside the boostrap-mode PostgreSQL instance, not inside the initdb that was processing the postgres.bki file. So it might well be that I didn't improve the total timing by much. > > > Various methods of reducing the size of postgres.bki were applied, as > > > detailed in the patch's commit message. I believe the current output > > > is still quite human readable. > > > > Overall this does not seem very worthwhile to me. > > Because the wins are too small? > > FWIW, Making postgres.bki smaller and improving bootstrapping time does seem > worthwhile to me. But it doesn't seem quite right to handle the batching in > the file format, it should be on the server side, no? The main reason I did batching in the file format is to reduce the storage overhead of the current one "INSERT" per row. Batching improved that by replacing the token with a different construct, but it's not neccessarily the only solution. The actual parser still inserts the tuples one by one in the relation, as I didn't spend time on making a simple_heap_insert analog for bulk insertions. > We really should stop emitting WAL during initdb... I think it's quite elegant that we're able to bootstrap the relation data of a new PostgreSQL cluster from the WAL generated in another cluster, even if it is indeed a bit wasteful. I do see your point though - the WAL shouldn't be needed if we're already fsyncing the files to disk. Kind regards, Matthias van de Meent Neon (https://neon.tech)
Re: GenBKI emits useless open;close for catalogs without rows
On Tue, 19 Sept 2023 at 20:05, Heikki Linnakangas wrote: > > On 18/09/2023 17:50, Matthias van de Meent wrote: > > (initdb takes about 73ms locally with syncing disabled) > > That's impressive. It takes about 600 ms on my laptop. Of which about > 140 ms goes into processing the BKI file. And that's with "initdb > -no-sync" option. Hmm, yes, I misinterpreted my own benchmark setup, the actual value would be somewhere around 365ms: I thought I was doing 50*50 runs in one timed run, but really I was doing only 50 runs. TO add insult to injury, I divided the total time taken by 250 instead of either 50 or 2500... Thanks for correcting me on that. > > Various methods of reducing the size of postgres.bki were applied, as > > detailed in the patch's commit message. I believe the current output > > is still quite human readable. > > Overall this does not seem very worthwhile to me. Reducing the size of redistributables sounds worthwhile to me, but if none of these changes are worth the effort, then alright, nothing gained, only time lost. > Looking at "perf" profile of initdb, I also noticed that a small but > measurable amount of time is spent in the "isatty(0)" call in do_end(). > Does anyone care about doing bootstrap mode interactively? We could > probably remove that. Yeah, that sounds like a good idea. Kind regards, Matthias van de Meent
Re: GenBKI emits useless open;close for catalogs without rows
Hi, On 2023-09-19 21:05:41 +0300, Heikki Linnakangas wrote: > On 18/09/2023 17:50, Matthias van de Meent wrote: > > (initdb takes about 73ms locally with syncing disabled) > > That's impressive. It takes about 600 ms on my laptop. Of which about 140 ms > goes into processing the BKI file. And that's with "initdb -no-sync" option. I think there must be a digit missing in Matthias' numbers. > > Various methods of reducing the size of postgres.bki were applied, as > > detailed in the patch's commit message. I believe the current output > > is still quite human readable. > > Overall this does not seem very worthwhile to me. Because the wins are too small? FWIW, Making postgres.bki smaller and improving bootstrapping time does seem worthwhile to me. But it doesn't seem quite right to handle the batching in the file format, it should be on the server side, no? We really should stop emitting WAL during initdb... > Looking at "perf" profile of initdb, I also noticed that a small but > measurable amount of time is spent in the "isatty(0)" call in do_end(). Does > anyone care about doing bootstrap mode interactively? We could probably > remove that. Heh, yea, that's pretty pointless. Greetings, Andres Freund
Re: GenBKI emits useless open;close for catalogs without rows
On 18/09/2023 17:50, Matthias van de Meent wrote: (initdb takes about 73ms locally with syncing disabled) That's impressive. It takes about 600 ms on my laptop. Of which about 140 ms goes into processing the BKI file. And that's with "initdb -no-sync" option. Various methods of reducing the size of postgres.bki were applied, as detailed in the patch's commit message. I believe the current output is still quite human readable. Overall this does not seem very worthwhile to me. One thing caught my eye though: We currently have an "open" command after every "create". Except for bootstrap relations; creating a bootstrap relation opens it implicitly. That seems like a weird inconsistency. If we make "create" to always open the relation, we can both make it more consistent and save a few lines. That's maybe worth doing, per the attached. It removes the "open" command altogether, as it's not needed anymore. Looking at "perf" profile of initdb, I also noticed that a small but measurable amount of time is spent in the "isatty(0)" call in do_end(). Does anyone care about doing bootstrap mode interactively? We could probably remove that. -- Heikki Linnakangas Neon (https://neon.tech) diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y index 81a1b7bfec..043ad9325f 100644 --- a/src/backend/bootstrap/bootparse.y +++ b/src/backend/bootstrap/bootparse.y @@ -99,7 +99,7 @@ static int num_columns_read = 0; /* NULLVAL is a reserved keyword */ %token NULLVAL /* All the rest are unreserved, and should be handled in boot_ident! */ -%token OPEN XCLOSE XCREATE INSERT_TUPLE +%token XCLOSE XCREATE INSERT_TUPLE %token XDECLARE INDEX ON USING XBUILD INDICES UNIQUE XTOAST %token OBJ_ID XBOOTSTRAP XSHARED_RELATION XROWTYPE_OID %token XFORCE XNOT XNULL @@ -119,8 +119,7 @@ Boot_Queries: ; Boot_Query : - Boot_OpenStmt - | Boot_CloseStmt + Boot_CloseStmt | Boot_CreateStmt | Boot_InsertStmt | Boot_DeclareIndexStmt @@ -129,20 +128,11 @@ Boot_Query : | Boot_BuildIndsStmt ; -Boot_OpenStmt: - OPEN boot_ident -{ - do_start(); - boot_openrel($2); - do_end(); -} - ; - Boot_CloseStmt: XCLOSE boot_ident { do_start(); - closerel($2); + boot_closerel($2); do_end(); } ; @@ -185,17 +175,17 @@ Boot_CreateStmt: */ mapped_relation = ($4 || shared_relation); + if (boot_reldesc) + { + elog(DEBUG4, "create: warning, open relation exists, closing first"); + boot_closerel(NULL); + } + if ($4) { TransactionId relfrozenxid; MultiXactId relminmxid; - if (boot_reldesc) - { - elog(DEBUG4, "create bootstrap: warning, open relation exists, closing first"); - closerel(NULL); - } - boot_reldesc = heap_create($2, PG_CATALOG_NAMESPACE, shared_relation ? GLOBALTABLESPACE_OID : 0, @@ -239,6 +229,7 @@ Boot_CreateStmt: InvalidOid, NULL); elog(DEBUG4, "relation created with OID %u", id); + boot_openrel(id); } do_end(); } @@ -466,7 +457,6 @@ boot_column_val: boot_ident: ID { $$ = $1; } - | OPEN { $$ = pstrdup($1); } | XCLOSE { $$ = pstrdup($1); } | XCREATE { $$ = pstrdup($1); } | INSERT_TUPLE { $$ = pstrdup($1); } diff --git a/src/backend/bootstrap/bootscanner.l b/src/backend/bootstrap/bootscanner.l index 6a9d4193f2..efb9e36090 100644 --- a/src/backend/bootstrap/bootscanner.l +++ b/src/backend/bootstrap/bootscanner.l @@ -71,8 +71,6 @@ sid \'([^']|\'\')*\' %% -open { boot_yylval.kw = "open"; return OPEN; } - close { boot_yylval.kw = "close"; return XCLOSE; } create { boot_yylval.kw = "create"; return XCREATE; } diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c index 5810f8825e..2fc01f9d4d 100644 --- a/src/backend/bootstrap/bootstrap.c +++ b/src/backend/bootstrap/bootstrap.c @@ -409,13 +409,10 @@ bootstrap_signals(void) * */ void -boot_openrel(char *relname) +boot_openrel(Oid id) { int i; - if (strlen(relname) >= NAMEDATALEN) - relname[NAMEDATALEN - 1] = '\0'; - /* * pg_type must be filled before any OPEN command is executed, hence we * can now populate Typ if we haven't yet. @@ -424,12 +421,12 @@ boot_openrel(char *relname) populate_typ_list(); if (boot_reldesc != NULL) - closerel(NULL); + boot_closerel(NULL); - elog(DEBUG4, "open relation %s, attrsize %d", - relname, (int) ATTRIBUTE_FIXED_PART_SIZE); + elog(DEBUG4, "open relation %u, attrsize %d", + id, (int) ATTRIBUTE_FIXED_PART_SIZE); - boot_reldesc = table_openrv(makeRangeVar(NULL, relname, -1), NoLock); + boot_reldesc = table_open(id, NoLock); numattr = RelationGetNumberOfAttributes(boot_reldesc); for (i = 0; i < numattr; i++) { @@ -450,11 +447,11 @@ boot_openrel(char *relname) } /* - * closerel + * boot_closerel *
Re: GenBKI emits useless open;close for catalogs without rows
On Tue, 12 Sept 2023 at 17:51, Matthias van de Meent wrote: > > On Fri, 1 Sept 2023 at 19:52, Tom Lane wrote: > > > > Alvaro Herrera writes: > > > On 2023-Sep-01, Matthias van de Meent wrote: > > >> A potential addition to the patch would to stop manually closing > > >> relations: initdb and check-world succeed without manual 'close' > > >> operations because the 'open' command auto-closes the previous open > > >> relation (in boot_openrel). Testing also suggests that the last opened > > >> relation apparently doesn't need closing - check-world succeeds > > >> without issues (incl. with TAP enabled). That is therefore implemented > > >> in attached patch 2 - it removes the 'close' syntax in its entirety. > > > > > Hmm, what happens with the last relation in the bootstrap process? Is > > > closerel() called via some other path for that one? > > > > Taking a quick census of existing closerel() callers: there is > > cleanup() in bootstrap.c, but it's called uncomfortably late > > and outside any transaction, so I misdoubt that it works > > properly if asked to actually shoulder any responsibility. > > (A little code reshuffling could fix that.) > > There are also a couple of low-level elog warnings in CREATE > > that would likely get triggered, though I suppose we could just > > remove those elogs. > > Yes, that should be easy to fix. > > > I guess my reaction to this patch is "why bother?". It seems > > unlikely to yield any measurable benefit, though of course > > that guess could be wrong. > > There is a small but measurable decrease in size of the generated bki > (2kb with both patches, on an initial 945kB), and there is some > related code that can be eliminated. If that's not worth bothering, > then I can drop the patch. Otherwise, I can update the patch to do the > cleanup that was within the transaction boundaries at the end of > boot_yyparse. > > If decreasing the size of postgres.bki is not worth the effort, I'll > drop any effort on doing so, but considering that it is about 1MB of > our uncompressed distributables, I'd say decreases in size are worth > the effort, most of the time. With the attached patch I've see a significant decrease in the size of postgres.bki of about 25%, and a likely related decrease in wall clock time spent in the bootstrap transaction: with timestamp logs inserted around the boot_yyparse() transaction the measured time went from around 49 ms on master to around 45 ms patched. In the grand scheme of initdb that might not be a lot of time (initdb takes about 73ms locally with syncing disabled) but it is a nice gain in performance. Comparison: master @ 9c13b681 $ du -b pg_install/share/postgres.bki 945220 $ initdb --no-instructions --auth=md5 --pwfile pwfile -N -D ~/test-dbinit/ [...] 2023-09-16 02:22:57.339 CEST [10422] LOG: Finished bootstrapping: to_start: 10 ms, transaction: 49 ms, finishing: 1 ms, total: 59 ms [...] patched $ du -b pg_install/share/postgres.bki 702574 $ initdb --no-instructions --auth=md5 --pwfile pwfile -N -D ~/test-dbinit/ [...] 2023-09-16 02:25:57.664 CEST [15645] LOG: Finished bootstrapping: to_start: 10 ms, transaction: 45 ms, finishing: 1 ms, total: 54 ms [...] Various methods of reducing the size of postgres.bki were applied, as detailed in the patch's commit message. I believe the current output is still quite human readable. There are other potential avenues for further reducing the bki size, e.g. through using smaller generated OIDs (reducing the number of characters used per OID), applying RLE on sequential NULLs (there are 3k+ occurances of /( __){2,10}/ in the generated bki file remaining), and other tricks, but several of those are likely to be detrimental to the readability and manual verifiability of the bki. Kind regards, Matthias van de Meent v2-0001-Update-BKI-syntax-bootstrap-performance.patch Description: Binary data
Re: GenBKI emits useless open;close for catalogs without rows
On Fri, 1 Sept 2023 at 19:52, Tom Lane wrote: > > Alvaro Herrera writes: > > On 2023-Sep-01, Matthias van de Meent wrote: > >> A potential addition to the patch would to stop manually closing > >> relations: initdb and check-world succeed without manual 'close' > >> operations because the 'open' command auto-closes the previous open > >> relation (in boot_openrel). Testing also suggests that the last opened > >> relation apparently doesn't need closing - check-world succeeds > >> without issues (incl. with TAP enabled). That is therefore implemented > >> in attached patch 2 - it removes the 'close' syntax in its entirety. > > > Hmm, what happens with the last relation in the bootstrap process? Is > > closerel() called via some other path for that one? > > Taking a quick census of existing closerel() callers: there is > cleanup() in bootstrap.c, but it's called uncomfortably late > and outside any transaction, so I misdoubt that it works > properly if asked to actually shoulder any responsibility. > (A little code reshuffling could fix that.) > There are also a couple of low-level elog warnings in CREATE > that would likely get triggered, though I suppose we could just > remove those elogs. Yes, that should be easy to fix. > I guess my reaction to this patch is "why bother?". It seems > unlikely to yield any measurable benefit, though of course > that guess could be wrong. There is a small but measurable decrease in size of the generated bki (2kb with both patches, on an initial 945kB), and there is some related code that can be eliminated. If that's not worth bothering, then I can drop the patch. Otherwise, I can update the patch to do the cleanup that was within the transaction boundaries at the end of boot_yyparse. If decreasing the size of postgres.bki is not worth the effort, I'll drop any effort on doing so, but considering that it is about 1MB of our uncompressed distributables, I'd say decreases in size are worth the effort, most of the time. Kind regards, Matthias van de Meent
Re: GenBKI emits useless open;close for catalogs without rows
Alvaro Herrera writes: > On 2023-Sep-01, Matthias van de Meent wrote: >> A potential addition to the patch would to stop manually closing >> relations: initdb and check-world succeed without manual 'close' >> operations because the 'open' command auto-closes the previous open >> relation (in boot_openrel). Testing also suggests that the last opened >> relation apparently doesn't need closing - check-world succeeds >> without issues (incl. with TAP enabled). That is therefore implemented >> in attached patch 2 - it removes the 'close' syntax in its entirety. > Hmm, what happens with the last relation in the bootstrap process? Is > closerel() called via some other path for that one? Taking a quick census of existing closerel() callers: there is cleanup() in bootstrap.c, but it's called uncomfortably late and outside any transaction, so I misdoubt that it works properly if asked to actually shoulder any responsibility. (A little code reshuffling could fix that.) There are also a couple of low-level elog warnings in CREATE that would likely get triggered, though I suppose we could just remove those elogs. I guess my reaction to this patch is "why bother?". It seems unlikely to yield any measurable benefit, though of course that guess could be wrong. regards, tom lane
Re: GenBKI emits useless open;close for catalogs without rows
On Fri, 1 Sept 2023 at 19:43, Alvaro Herrera wrote: > > On 2023-Sep-01, Matthias van de Meent wrote: > > > A potential addition to the patch would to stop manually closing > > relations: initdb and check-world succeed without manual 'close' > > operations because the 'open' command auto-closes the previous open > > relation (in boot_openrel). Testing also suggests that the last opened > > relation apparently doesn't need closing - check-world succeeds > > without issues (incl. with TAP enabled). That is therefore implemented > > in attached patch 2 - it removes the 'close' syntax in its entirety. > > Hmm, what happens with the last relation in the bootstrap process? Is > closerel() called via some other path for that one? There is a final cleanup() call that closes the last open boot_reldesc relation (if any) at the end of BootstrapModeMain, after boot_yyparse has completed and its changes have been committed. - Matthias
Re: GenBKI emits useless open;close for catalogs without rows
On 2023-Sep-01, Matthias van de Meent wrote: > A potential addition to the patch would to stop manually closing > relations: initdb and check-world succeed without manual 'close' > operations because the 'open' command auto-closes the previous open > relation (in boot_openrel). Testing also suggests that the last opened > relation apparently doesn't need closing - check-world succeeds > without issues (incl. with TAP enabled). That is therefore implemented > in attached patch 2 - it removes the 'close' syntax in its entirety. Hmm, what happens with the last relation in the bootstrap process? Is closerel() called via some other path for that one? -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ Essentially, you're proposing Kevlar shoes as a solution for the problem that you want to walk around carrying a loaded gun aimed at your foot. (Tom Lane)
GenBKI emits useless open;close for catalogs without rows
Hi, Whilst looking at PostgreSQL's bootstrapping process, I noticed that postgres.bki contains quite a few occurrances of the pattern "open $catname; close $catname". I suppose this pattern isn't too expensive, but according to my limited research a combined open+close cycle doens't do anything meaningful, so it does waste some CPU cycles in the process. The attached patch 1 removes the occurances of those combined open/close statements in postgresql.bki. Locally it passes check-world, so I assume that opening and closing a table is indeed not required for initializing a data-less catalog during bootstrapping. A potential addition to the patch would to stop manually closing relations: initdb and check-world succeed without manual 'close' operations because the 'open' command auto-closes the previous open relation (in boot_openrel). Testing also suggests that the last opened relation apparently doesn't need closing - check-world succeeds without issues (incl. with TAP enabled). That is therefore implemented in attached patch 2 - it removes the 'close' syntax in its entirety. Kind regards, Matthias van de Meent Neon (https://neon.tech) 0002-Remove-the-bki-close-command.patch Description: Binary data 0001-Stop-emitting-open-nodata-close-patterns-in-genbki.p.patch Description: Binary data