Hi, On Tue, Aug 27, 2024 at 8:01 PM Antonin Houska <a...@cybertec.at> wrote: > > Attached is version 2, the feature itself is now in 0004. > > Unlike version 1, it contains some regression tests (0006) and a new GUC to > control how long the AccessExclusiveLock may be held (0007). > > Kirill Reshke <reshkekir...@gmail.com> wrote: > > > On Fri, 2 Aug 2024 at 11:09, Antonin Houska <a...@cybertec.at> wrote: > > > > > > Kirill Reshke <reshkekir...@gmail.com> wrote: > > > > However, in general, the 3rd patch is really big, very hard to > > > > comprehend. Please consider splitting this into smaller (and > > > > reviewable) pieces. > > > > > > I'll try to move some preparation steps into separate diffs, but not sure > > > if > > > that will make the main diff much smaller. I prefer self-contained > > > patches, as > > > also explained in [3]. > > > > Thanks for sharing [3], it is a useful link. > > > > There is actually one more case when ACCESS EXCLUSIVE is held: during > > table rewrite (AT set TAM, AT set Tablespace and AT alter column type > > are some examples). > > This can be done CONCURRENTLY too, using the same logical replication > > approach, or do I miss something? > > Yes, the logical replication can potentially be used in other cases. > > > I'm not saying we must do it immediately, this should be a separate > > thread, but we can do some preparation work here. > > > > I can see that a bunch of functions which are currently placed in > > cluster.c can be moved to something like > > logical_rewrite_heap.c. ConcurrentChange struct and > > apply_concurrent_insert function is one example of such. > > > > So, if this is the case, 0003 patch can be splitted in two: > > The first one is general utility code for logical table rewrite > > The second one with actual VACUUM CONCURRENTLY feature. > > > What do you think? > > I can imagine moving the function process_concurrent_changes() and subroutines > to a different file (e.g. rewriteheap.c), but moving it into a separate diff > that does not contain any call of the function makes little sense to me. Such > a diff would not add any useful functionality and could not be considered > refactoring either. > > So far I at least moved some code to separate diffs: 0003 and 0005. I'll move > more if I find sensible opportunity in the future. > > -- > Antonin Houska > Web: https://www.cybertec-postgresql.com >
Thanks for working on this, I think this is a very useful feature. The patch doesn't compile in the debug build with errors: ../postgres/src/backend/commands/cluster.c: In function ‘get_catalog_state’: ../postgres/src/backend/commands/cluster.c:2771:33: error: declaration of ‘td_src’ shadows a previous local [-Werror=shadow=compatible-local] 2771 | TupleDesc td_src, td_dst; | ^~~~~~ ../postgres/src/backend/commands/cluster.c:2741:25: note: shadowed declaration is here 2741 | TupleDesc td_src = RelationGetDescr(rel); you forgot the meson build for pgoutput_cluster diff --git a/src/backend/meson.build b/src/backend/meson.build index 78c5726814..0f9141a4ac 100644 --- a/src/backend/meson.build +++ b/src/backend/meson.build @@ -194,5 +194,6 @@ pg_test_mod_args = pg_mod_args + { subdir('jit/llvm') subdir('replication/libpqwalreceiver') subdir('replication/pgoutput') +subdir('replication/pgoutput_cluster') I noticed that you use lmode/lock_mode/lockmode, there are lmode and lockmode in the codebase, but I remember someone proposed all changes to lockmode, how about sticking to lockmode in your patch? 0004: + sure that the old files do not change during the processing because the + chnages would get lost due to the swap. typo + files. The data changes that took place during the creation of the new + table and index files are captured using logical decoding + (<xref linkend="logicaldecoding"/>) and applied before + the <literal>ACCESS EXCLUSIVE</literal> lock is requested. Thus the lock + is typically held only for the time needed to swap the files, which + should be pretty short. I remember pg_squeeze also did some logical decoding after getting the exclusive lock, if that is still true, I guess the doc above is not precise. + Note that <command>CLUSTER</command> with the + the <literal>CONCURRENTLY</literal> option does not try to order the + rows inserted into the table after the clustering started. Do you mean after the *logical decoding* started here? If CLUSTER CONCURRENTLY does not order rows at all, why bother implementing it? + errhint("CLUSTER CONCURRENTLY is only allowed for permanent relations"))); errhint messages should end with a dot. Why hardcoded to "CLUSTER CONCURRENTLY" instead of parameter *stmt*. + ResourceOwner oldowner = CurrentResourceOwner; + + /* + * In the CONCURRENT case, do the planning in a subtrensaction so that typo I did not see VacuumStmt changes in gram.y, how do we suppose to use the vacuum full concurrently? I tried the following but no success. [local] postgres@demo:5432-36097=# vacuum (concurrently) aircrafts_data; ERROR: CONCURRENTLY can only be specified with VACUUM FULL [local] postgres@demo:5432-36097=# vacuum full (concurrently) full aircrafts_data; ERROR: syntax error at or near "(" LINE 1: vacuum full (concurrently) full aircrafts_data; -- Regards Junwang Zhao